hexylena / groene-thee

AngularJS + REST based Tripal alternative.

Roadmap #1

Open hexylena opened 8 years ago

hexylena commented 8 years ago
abretaud commented 8 years ago

Do you have plans for loading data into chado? I mean data that doesn't come from apollo, since for that there is now the galaxy/apollo stuff. There are the gmod perl scripts, but I must admit I'm not very fond of them!

hexylena commented 8 years ago

My plans for data loading are to make those perl scripts available as Galaxy tools. Any scripts in particular?

I'm waiting to find a bit of time to do the galaxy-side portion first (we just need a generic, user-configurable key-value store with predictable parameter names so tools can say "hey I can accept values for apollo.username, apollo.host, apollo.password" and then let the user configure those and re-use across compatible tools) before starting on wrapping those scripts.
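
A rough sketch of that key-value idea, just to make it concrete. This is not Galaxy's API; `UserSettings`, `Tool`, the tool names, and the example values are all hypothetical. Only the dotted key names (`apollo.username`, `apollo.host`, `apollo.password`) come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class UserSettings:
    """Hypothetical per-user key-value store with dotted keys such as 'apollo.host'."""

    values: dict = field(default_factory=dict)

    def set(self, key: str, value: str) -> None:
        self.values[key] = value

    def resolve(self, accepted_keys: list) -> dict:
        """Return the stored values for exactly the keys a tool says it accepts."""
        missing = [k for k in accepted_keys if k not in self.values]
        if missing:
            raise KeyError("missing settings: " + ", ".join(missing))
        return {k: self.values[k] for k in accepted_keys}


@dataclass
class Tool:
    """Hypothetical tool that declares which setting keys it can accept."""

    name: str
    accepted_keys: list

    def parameters(self, settings: UserSettings) -> dict:
        # The tool only ever sees the keys it declared, nothing else.
        return settings.resolve(self.accepted_keys)


# The user configures the values once...
settings = UserSettings()
settings.set("apollo.host", "https://apollo.example.org")
settings.set("apollo.username", "alice")
settings.set("apollo.password", "s3cret")
settings.set("chado.dbname", "chado")  # unrelated key, ignored by the Apollo tools

# ...and every compatible tool re-uses them.
export_tool = Tool("apollo_export", ["apollo.host", "apollo.username", "apollo.password"])
list_tool = Tool("apollo_list_organisms", ["apollo.host", "apollo.username", "apollo.password"])

export_params = export_tool.parameters(settings)
list_params = list_tool.parameters(settings)
```

The point of the predictable names is that two tools declaring the same keys get the same values without the user re-entering them, and a tool asking for a key the user never configured fails early with a clear message.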

hexylena commented 8 years ago

Also, any ideas which scripts/what sort of data you will wish to load?

abretaud commented 8 years ago

The data we usually load is:

I had some trouble with gmod_bulk_load_gff3.pl, especially when loading annotation (some features duplicated for unknown reasons, automatically created peptide features with random names, ...), and I ended up writing little scripts to fix the data once loaded, rather than fixing it upstream (bad boy!)...

hexylena commented 8 years ago

Ok how is this:

| data | loader |
| --- | --- |
| genome | apollo |
| annotations | apollo, bulk_load |
| blast results | bulk_load (blast2gapped gff3 might be useful, would love to see your custom blast GFF3 though, everyone has interesting approaches to this ;)) |
| interpro | bulk_load |
| blast2go | ??? (is this something we can do in bulk load? Would you be interested in sharing your script? Maybe it is generally applicable and we can make more useful tools for people? Hmmm.) |
| ontology | I think we'll have to wrap the xort/ontology loading equipment. I'm a bit concerned about doing it though, need to think a bit more on how to deal with these DBs. |

hexylena commented 8 years ago

And yes, similar experiences with bulk_load doing strange things...very similar.

abretaud commented 8 years ago

Yes, it could be something like this.

For blast2go, I have to check, but IIRC the loading method should be reworked because it doesn't work well with multiple analyses.

For ontology, well, it's orthology in fact ;) i.e. gene similarities between different species. This script is a little bit too simple; I think we would move to something more like https://github.com/legumeinfo/tripal_phylotree and http://gmod.org/wiki/Chado_Phylogeny_Module to store the data.

Anyway, no problem to share these scripts of course, I just need to find some time to have a look at them first.

hexylena commented 8 years ago

blast2go

ok, sounds good.

orthology

ahhh interesting. I considered that it might be a typo, since chado people seem so often more concerned with such. Interesting. :)