galaxy-genome-annotation / python-apollo

Python library for talking to Apollo API
MIT License

Am I doing this wrong? #57

Open jaspwn opened 1 year ago

jaspwn commented 1 year ago

Hi,

Firstly sorry for the long post...

I really like GMOD/apollo and I would like to interact with and manipulate it using python-apollo and/or the arrow CLI, but I cannot get it to work consistently, and I wonder if I am doing it entirely wrong.

I have set up a persistent instance of Apollo using Docker on a desktop machine where I have an empty writable folder for the jbrowse data.

Then on my laptop I can use arrow init to find the server and was originally only using arrow annotations load_gff to bulk add annotations. This worked relatively well (although it would choose some odd reading frames/start codons) until I then wanted to do another bulk upload.

I thought that it would just merge annotations that already existed, but it duplicated them. I then tried arrow organisms delete_features, but it didn't work on organisms where I had performed two separate annotations load_gff runs, so it was either delete each annotation manually through the browser (oofff) or start again.

In starting again I was hoping to use python-apollo and/or the arrow CLI to do this programmatically, but it never seemed to upload files to the correct place, and using arrow remote add_organism or arrow remote add_track produced a mixture of results but mostly failures.

Basically I am a little clueless about running an apollo/jbrowse server and I wonder if what I am doing is out of scope or just plain wrong.

I wonder if anyone is willing to spend some time with me over zoom or email, where I can explain in more detail what I am trying to do, and guide me a little?

Thanks for just reading and getting this far :-)

abretaud commented 1 year ago

Hi! Cool to see a new user :)

> it would just merge annotations that already existed but it duplicated them

Yes, there's no merging possible in Apollo currently; load_gff will just add new features next to existing ones.
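Since nothing is merged on the server side, one possible workaround is to compare the second GFF3 file against the first before uploading it. The sketch below is only an illustration and uses no Apollo or python-apollo calls; the function names are made up for this example. It assumes that two features are "the same" when their seqid, type, start, end, and strand match, which may be too strict or too loose for real data. It also looks at each line on its own and ignores Parent relationships, so dropping a duplicate gene could orphan its child features; treat the output as a report to review rather than a file to upload blindly.

```python
def parse_features(gff_text):
    """Yield (raw_line, columns) for every 9-column GFF3 feature line."""
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) == 9:
            yield line, columns


def feature_key(columns):
    """Assumed identity of a feature: seqid, type, start, end, strand."""
    return (columns[0], columns[2], columns[3], columns[4], columns[6])


def split_candidates(loaded_gff, candidate_gff):
    """Split candidate features into (duplicates, fresh) lists of raw lines.

    loaded_gff    -- text of the GFF3 file that was already uploaded
    candidate_gff -- text of the GFF3 file about to be uploaded
    """
    seen = {feature_key(cols) for _, cols in parse_features(loaded_gff)}
    duplicates, fresh = [], []
    for line, cols in parse_features(candidate_gff):
        (duplicates if feature_key(cols) in seen else fresh).append(line)
    return duplicates, fresh


# Tiny made-up example: one gene is repeated, one is new.
loaded = "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID=g1\n"
candidate = (
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID=g1_again\n"
    "chr1\tsrc\tgene\t900\t1200\t.\t-\t.\tID=g2\n"
)
duplicates, fresh = split_candidates(loaded, candidate)
print(len(duplicates), len(fresh))  # prints: 1 1
```

Only the lines in fresh would be candidates for a second upload; anything in duplicates is already present under the assumed definition of identity.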

> organisms delete_features didn't work

Did you get a specific error?

> In starting again I was hoping to use python-apollo and/or the arrow CLI to do this programmatically, but it never seemed to upload files to the correct place, and using arrow remote add_organism or arrow remote add_track produced a mixture of results but mostly failures.

There's surely a way to make it work; we use it regularly here. Remote mode can be affected by timeouts when it uploads content as big archives.

If you have any specific error message it would help I think. And also any details on how you deployed your apollo instance (docker-compose.yml maybe? access through a reverse proxy or directly?)

Not much time for a zoom in the next few days, but we can discuss here, I guess.

jaspwn commented 1 year ago

Hi, thanks for the response!

Unfortunately, for delete_features I didn't note the error exactly, and I have since started again from scratch, so I am unlikely to be able to reproduce it, but I will take note next time.

So, when trying to add an organism using remote:

arrow remote add_organism --blatdb genome.2bit name GenomeFasta.tar.gz

I get this error:

{ "error": "/data/temporary/apollo_data/27482-name/trackList.json (No such file or directory)" }

I get a similar error when trying to use remote add_track. I have noticed that the tarballs are ending up in the root directory of the Docker container, and it looks like the files are maybe not being untarred correctly or to the right locations. Sometimes a directory is created with the correct name but it is empty.
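The error above complains that trackList.json is missing from the extracted organism folder, so one thing that may be worth checking is whether the uploaded tarball contains that file at all, and at which depth. The sketch below is a diagnostic only, written with Python's standard tarfile module; it makes no Apollo calls, and the function names are invented for this example. Whether Apollo expects trackList.json at the top level of the archive is an assumption here, not something confirmed in this thread.

```python
import tarfile
from pathlib import PurePosixPath


def locate_member(archive_path, filename="trackList.json"):
    """Return the archive-internal paths whose basename equals filename."""
    # "r:*" lets tarfile detect the compression (gz, bz2, xz, or none).
    with tarfile.open(archive_path, "r:*") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if PurePosixPath(member.name).name == filename
        ]


def describe_layout(archive_path, filename="trackList.json"):
    """Classify the archive as 'missing', 'top-level', or 'nested'."""
    hits = locate_member(archive_path, filename)
    if not hits:
        return "missing"
    # PurePosixPath drops a leading "./", so "./trackList.json" and
    # "trackList.json" both have a single path component.
    depths = [len(PurePosixPath(hit).parts) for hit in hits]
    return "top-level" if min(depths) == 1 else "nested"
```

Calling describe_layout("GenomeFasta.tar.gz") on the archive from the command above would return "missing" if the tarball holds only FASTA files, or "nested" if trackList.json sits inside a subfolder. Either result would be consistent with the error message, though it would not prove the cause.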

I don't use a docker-compose.yml (or know where to find one), but this is the docker command I use to set up the container:

docker run --memory=8g -it \
  -v /Users/user/Desktop/apollo/jbrowse/root/directory/:/data \
  -v /Users/user/Desktop/apollo/postgres/data/directory:/var/lib/postgresql \
  -v /Users/user/Desktop/apollo/jbrowse/root/apollo_data:/data/temporary/apollo_data \
  -e APOLLO_ADMIN_EMAIL=admin \
  -e APOLLO_ADMIN_PASSWORD=password \
  -p 8888:8080 \
  gmod/apollo:latest

The Docker container is running on a networked computer, so I just connect using its IP and port. I hope that is enough information; I'm not so good with networking (:

Thanks!