legumeinfo / gcv

Federating genomes with love (and synteny derived from functional annotations)
https://gcv.legumeinfo.org/
Apache License 2.0

Minimal documentation #99

Closed abretaud closed 6 years ago

abretaud commented 7 years ago

Hi, I'm very interested in using this application, and I'm trying to make a docker image for it (I'll share it). I'm having trouble understanding how to run a fully working version of the application: I think I managed to launch the server, but I'm not sure what to do with the client code. I also don't understand how to load data into the chado database: are there loading scripts, or can it use a tripal db? Some (even minimal) documentation on these points would be greatly appreciated!

adf-ncgr commented 7 years ago

Hi- very glad to hear of your interest in this tool (which can play nicely with the phylotree module you were inquiring about at the other repository). Great to hear that you're working on the dockerization too; we would be happy to work with you on that front. With respect to documentation, we are currently in the process of helping another one of our collaborators ( @vivekkrish ) get up and running with the application, and we're going to try to use this occasion to motivate some documentation, so maybe we can all work together on filling in the holes in what is currently on our github wiki pages (very skeletal and probably out of date too!)

With respect to the loading of data, it is largely based on the standard chado gene annotation representation, but you will need to have some strategy for gene family assignments. This can be based on membership in trees, or it can be based simply on matching the genes against HMMs from some gene family definition. Depending on what you have in mind for your site, the approach for loading will be slightly different. Our legumeinfo.org site is a bit of a hybrid, where some species are included in the trees and other species whose genomes came along after the trees were built have assignments done via the HMM approach. We can try to get documentation for both of these, but the work with @vivekkrish is largely focused on the HMM approach.
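To illustrate the HMM-based approach mentioned above, here is a minimal sketch of how one might assign each gene to a single family from search results. It is not taken from the GCV or legumeinfo loading scripts. It assumes the hits come from HMMER's `hmmsearch --tblout` output, where column 1 is the target (gene/protein) name, column 3 the query (family HMM) name, column 5 the full-sequence E-value and column 6 the full-sequence bit score. The function name `best_family_per_gene`, the `max_evalue` cutoff of 1e-10, and the gene and family names in the usage example are all hypothetical choices for illustration.

```python
from typing import Dict, Iterable, Tuple


def best_family_per_gene(
    tblout_lines: Iterable[str],
    max_evalue: float = 1e-10,  # assumed cutoff, not a project default
) -> Dict[str, Tuple[str, float]]:
    """Map each gene to its best-scoring family HMM.

    Assumes `hmmsearch --tblout` layout (whitespace-separated):
    col 1 = target (gene) name, col 3 = query (HMM) name,
    col 5 = full-sequence E-value, col 6 = full-sequence bit score.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_lines:
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # malformed row, ignore it
        gene, family = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue  # hit is too weak to count as family membership
        # Keep only the highest-scoring family seen so far for this gene.
        if gene not in best or score > best[gene][1]:
            best[gene] = (family, score)
    return best


if __name__ == "__main__":
    example = [
        "# target name  accession  query name  accession  E-value  score  bias",
        "geneA - fam1 - 1e-50 200.0 0.1",
        "geneA - fam2 - 1e-80 300.5 0.0",
        "geneB - fam1 - 0.5 10.0 0.0",
    ]
    for gene, (family, score) in best_family_per_gene(example).items():
        print(gene, family, score)
```

Taking the single best hit per gene is only one possible policy; a tree-based assignment, or a tool such as OrthoFinder, would produce the gene-to-family mapping differently.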

For the client code, you'll need to use npm install to get all the dependencies; then npm run build will do the compilation/bundling. After that, configuring your webserver to have a URL that points into the dist directory should pick up everything needed. I can share an apache config with you if that would help; I may also be forgetting some subtleties. There is also a development-oriented webserver that can be started using npm run start, which will launch it at localhost:3000 (by default).

Also, if you'd like some example data we could make a small postgres dump of one of our databases that you could use to explore.

Keep us posted of your progress, and we'll be happy to incorporate lessons learned into the nascent documentation effort...

abretaud commented 7 years ago

Thanks, and sorry for my slow response here too! I didn't realise you were the same author(s) as https://github.com/legumeinfo/tripal_phylotree/

With the instructions you gave me I managed to make a first docker image, I'll make it available soon (once I manage to load data).

I'm not 100% sure what strategy I will use for gene family assignment; I'll possibly test a few different ones, starting with OrthoFinder first I guess.

I'll let you know when I have more news on this, and I'm happy to help with the documentation.

abretaud commented 6 years ago

Ok, I'm continuing my experiments with orthology stuff! The docker image I've made is online now: https://github.com/abretaud/docker-lis-gcv and https://quay.io/repository/abretaud/lis-gcv

I have an error when using a chado database where I loaded a tree with the perl script https://github.com/legumeinfo/tripal_phylotree/blob/lis_master/scripts/gmod_load_tree.pl

It says:

Internal Server Error: /services/v1/gene-to-query-track/
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/gcv/server/services/views.py", line 22, in wrapper
    response = view(request, *args, **kwargs)
  File "/opt/gcv/server/services/views.py", line 259, in v1_gene_to_query_track
    focus_order = list(GeneOrder.objects.filter(gene=focus))
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 162, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 965, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 238, in iterator
    results = compiler.execute_sql()
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 840, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python2.7/dist-packages/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
ProgrammingError: relation "gene_order" does not exist
LINE 1: ...gene_order"."gene_id", "gene_order"."number" FROM "gene_orde...

So it seems like a table is missing: how can I create it and populate it?

adf-ncgr commented 6 years ago

Sorry for the slow reply (and sorrier for the continued lack of documentation). The script you want to use to create and populate that gene_order table is here (note the branch "JCVI"): https://github.com/legumeinfo/lis_context_viewer/blob/JCVI/scripts/perl/gmod_gene_ordering.pl

I thought we had merged this branch into master already, but it looks like we haven't yet. I should check with @alancleary that it is safe to do so before I charge ahead, but you can try using it from that branch anyway if you get back to this before we've merged it.
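For readers wondering what the missing gene_order table holds: the traceback above shows it has a gene_id and a number column. The sketch below shows one plausible way such rows could be derived, by sorting the genes on each chromosome by start coordinate and numbering them in that order. It is an assumption for illustration, not a description of what gmod_gene_ordering.pl actually does. The function name `gene_order_rows`, the input tuple layout and the 0-based numbering are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def gene_order_rows(
    genes: Iterable[Tuple[int, str, int]],
) -> List[Tuple[int, int]]:
    """Return (gene_id, number) rows, one per gene.

    `genes` is an iterable of (gene_id, chromosome, fmin) tuples, where
    fmin is the gene's start coordinate on that chromosome. Numbering
    restarts at 0 on every chromosome (an assumption; the real loader
    may number differently).
    """
    # Group genes by chromosome so each one gets its own ordering.
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for gene_id, chrom, fmin in genes:
        by_chrom[chrom].append((fmin, gene_id))

    rows: List[Tuple[int, int]] = []
    for chrom in sorted(by_chrom):
        # Sort by start coordinate (ties broken by gene_id), then number.
        for number, (_fmin, gene_id) in enumerate(sorted(by_chrom[chrom])):
            rows.append((gene_id, number))
    return rows


if __name__ == "__main__":
    example = [(10, "chr1", 500), (11, "chr1", 100), (12, "chr2", 50)]
    for gene_id, number in gene_order_rows(example):
        print(gene_id, number)
```

Precomputing an ordinal per gene like this would let the server fetch a gene's neighbours by number rather than by comparing coordinates, which is presumably why the view queries GeneOrder for the focus gene.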

adf-ncgr commented 6 years ago

OK, the loader scripts have now been merged to master (I made a few minor tweaks to make the names given to the custom "controlled vocabulary" a little less specific to our project, though still app-specific). Use https://github.com/legumeinfo/lis_context_viewer/tree/master/scripts/perl/gmod_gene_ordering.pl when you are ready and let us know if it doesn't work. Still to do is the documentation on the wiki that is promised in scripts/README.md (which I will really try to do soon so I can close this issue!)

adf-ncgr commented 6 years ago

OK, I've fleshed out the page here: https://github.com/legumeinfo/lis_context_viewer/wiki/Configuring-and-Loading-Chado

I will close this issue, but feel free to re-open it or start a new one if you find it still too minimal...