Open hexylena opened 8 years ago
Just to clarify: you want (from Chado land) data structures and web services that support:
I had added metadata for Organism previously to support (hopefully) JSON, and it makes sense to do the same for Sequence.
My big question mark would be whether we want to add the more formal Chado versions of these or just leave them in JSON-land. The former would definitely make Chado export / import easier.
@nathandunn yep, that sounds like what I want, support for that metadata so that my users can annotate it and query on it.
I'm not sure what you mean w/r/t more formal or not. The key point for us is that they're foreign-keyed on the cvterm table, so I can restrict the tags that the annotators use.
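To make the foreign-key point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables are heavily simplified stand-ins for Chado's cvterm, organism, and organismprop (the real tables have more columns), and the helper names `make_db` and `add_prop` are hypothetical. It is not how Apollo stores anything today; it only shows that a property whose type_id is not a known cvterm gets rejected by the database.

```python
import sqlite3


def make_db():
    """Create an in-memory database with simplified Chado-like tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE
        );
        CREATE TABLE organism (
            organism_id INTEGER PRIMARY KEY,
            common_name TEXT NOT NULL
        );
        CREATE TABLE organismprop (
            organismprop_id INTEGER PRIMARY KEY,
            organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
            type_id     INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
            value       TEXT
        );
        """
    )
    return conn


def add_prop(conn, organism_id, type_id, value):
    """Attach a property to an organism; type_id must be an existing cvterm."""
    conn.execute(
        "INSERT INTO organismprop (organism_id, type_id, value) VALUES (?, ?, ?)",
        (organism_id, type_id, value),
    )
```

With this layout, calling `add_prop` with a type_id that is not in cvterm raises `sqlite3.IntegrityError`, which is the restriction on annotator tags described above.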
I think that sounds good. It might be a while until we have time to work on it, though I don’t think it will be that bad. It should follow the same pattern as FeatureCVTerm.
I think that the trickier part is figuring out how you want users to annotate the Organism / Sequences within the database. We can link up to Chado, but we aren’t explicitly pulling anything in (yet).
I started a database migration script to move towards using OrganismProperties more:
https://github.com/cmdcolin/Apollo/commit/f3c1d842bfde8b3e631efb321eac69ff2431a5a7
It moves columns like blatdb and directory into properties, since I think these values are not intrinsic to an organism and should be properties :)
It also tries to assert uniqueness on name to fix #990
@erasche This would be easy to do. Let me know if you need any help doing it and I can point you in the right direction.
Just throwing ideas out here, but I wanted to at least write them down and maybe other people would have useful comments.
Now that we're putting increasingly large numbers of genomes in Apollo, data retrieval is coming up as a (relatively minor) pain point. I was considering how I might ease this process, and the operations which would occur often for us. Those operations are largely "fetch all genomes matching criteria X."
This has a natural representation in the organismprop / organism_dbxref / feature_cvterms* tables in Chado. They're metadata about organisms / genomes / chromosomes which would be useful if they could be queried against.
* = for landmark features only
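As a rough illustration of "fetch all genomes matching criteria X" against organismprop, here is a sketch with Python's sqlite3 module. The schema is a simplified stand-in for the Chado tables, and the tag names ("host", "isolation_source"), the sample organisms, and the functions `build_demo_db` and `organisms_matching` are all made up for the example.

```python
import sqlite3


def build_demo_db():
    """Build an in-memory database with simplified Chado-like tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE organism (organism_id INTEGER PRIMARY KEY, common_name TEXT NOT NULL);
        CREATE TABLE organismprop (
            organismprop_id INTEGER PRIMARY KEY,
            organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
            type_id     INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
            value       TEXT
        );
        INSERT INTO cvterm VALUES (1, 'host'), (2, 'isolation_source');
        INSERT INTO organism VALUES (1, 'phage A'), (2, 'phage B'), (3, 'phage C');
        INSERT INTO organismprop (organism_id, type_id, value) VALUES
            (1, 1, 'E. coli'),
            (2, 1, 'B. subtilis'),
            (3, 1, 'E. coli'),
            (3, 2, 'sewage');
        """
    )
    return conn


def organisms_matching(conn, tag, value):
    """Return (organism_id, common_name) for organisms carrying tag == value."""
    rows = conn.execute(
        """
        SELECT o.organism_id, o.common_name
        FROM organism AS o
        JOIN organismprop AS p ON p.organism_id = o.organism_id
        JOIN cvterm AS t ON t.cvterm_id = p.type_id
        WHERE t.name = ? AND p.value = ?
        ORDER BY o.organism_id
        """,
        (tag, value),
    )
    return rows.fetchall()
```

For the sample data, `organisms_matching(conn, "host", "E. coli")` returns the rows for phage A and phage C. An API filter could plausibly be a thin wrapper over a query of this shape.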
It would be nice if...
If this data was available, I could imagine the API supporting filtering, so I could make requests like:
Anyway. Just an idea.