dib-lab / dammit

just annotate it, dammit!
http://dib-lab.github.io/dammit/

curl: (28) Failed to connect to cegg.unige.ch port 21: Connection timed out #229

Open hoelzer opened 3 years ago

hoelzer commented 3 years ago

Hi!

I just tried to reinstall the databases, which fails at this step:

curl -o aa_seq_euk.fasta.gz ftp://cegg.unige.ch/OrthoDB8/Eukaryotes/FASTA/aa_seq_euk.fasta.gz
curl: (28) Failed to connect to cegg.unige.ch port 21: Connection timed out

Maybe the URL is not available anymore?

josruirod commented 3 years ago

I'm facing the same problem. I want to try dammit, but it seems I cannot install it because of this. I have also looked for aa_seq_euk.fasta online, unsuccessfully. Could you fix this? Thanks

hoelzer commented 3 years ago

I just uploaded databases needed by dammit here:

https://zenodo.org/record/5036558#.YO72G4QzaV4

https://zenodo.org/api/files/773602b2-e239-4489-8f97-cebb99f36a81/dbs.tar.gz

With that, you can start the annotation process directly. However, be aware that I downloaded the databases for a fungal annotation, so the included BUSCO database is fungi-specific!

shrhops commented 2 years ago

Hi, I'm trying to unzip the databases uploaded at this link, and with both of them (Dbs and odb10v1_all_fasta.tab) I get the error "invalid compressed data--format violated". The dbs directory does contain some of the necessary files, but not all, so I can't proceed with annotation either. Also, I already have the BUSCO database for eukaryotes. Would I just add that directory into the dbs one?

Could you help with these issues? Thanks

hoelzer commented 2 years ago

Hi @shrhops, I am also currently working with the uploaded database again. Let's see: if I run into that issue as well, I can package and upload it again. Maybe some file corruption happened while packaging or uploading to Zenodo.
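
One way to rule out a corrupted or truncated download before extracting is to test the archive first. A minimal sketch, assuming a POSIX shell with `gzip` and `tar` available; `verify_targz` is a hypothetical helper name, not part of dammit:

```shell
# Check a downloaded .tar.gz before extracting it.
# Returns 0 only if the gzip stream and the tar listing are both readable.
verify_targz() {
    archive="$1"
    [ -f "$archive" ] || { echo "missing: $archive" >&2; return 1; }
    # gzip -t tests the compressed stream without writing any output files
    gzip -t "$archive" 2>/dev/null || { echo "corrupt gzip stream: $archive" >&2; return 1; }
    # tar -tzf only lists the members; a damaged archive fails here
    tar -tzf "$archive" >/dev/null 2>&1 || { echo "unreadable tar: $archive" >&2; return 1; }
    echo "ok: $archive"
}
```

For example, `verify_targz dbs.tar.gz && tar zxvf dbs.tar.gz`. If the check fails, re-downloading is the first thing to try; the Zenodo record page also lists a checksum per file that can be compared against the local copy.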

hoelzer commented 2 years ago

@shrhops for me it works. I just did (copied commands from a nextflow pipeline, so maybe needs slight adjustments):

wget https://zenodo.org/api/files/773602b2-e239-4489-8f97-cebb99f36a81/dbs.tar.gz

tar zxvf dbs.tar.gz
dammit annotate $FASTA --database-dir $PWD/dbs --busco-group fungi -n foobar -o out --n_threads 8 

I run dammit in a docker container: docker pull nanozoo/dammit:1.2--b47259e

shrhops commented 2 years ago

@hoelzer I installed dammit via conda, maybe that's the issue? Nevertheless, I'm now retrying with the same steps as you have shown above. By the way, do you know if I can just move my BUSCO files (for eukaryotes) into the dbs directory? Would that work?

hoelzer commented 2 years ago

Maybe it has to do with running dammit via conda vs. Docker.

Sorry, no clue whether you can just replace the files, but I would think so. I would be really interested to know if this works, because then the db dump provided via Zenodo could also be used for non-fungi (it can be anyway, but then the BUSCO part would be correct as well).

shrhops commented 2 years ago

~Ah it looks like I'm having the same problem as before. The only databases I'm missing, though, are OrthoDB and sprot. I haven't been able to find where to download them - do you know where I can? I was going to try just downloading those two and adding them to the database directory that dammit has already installed.~

I take it back; it works now, for some reason!

rix133 commented 2 years ago

If you want to use the latest version of OrthoDB, you could also consider using a fork where I implemented it: https://github.com/rix133/dammit/tree/OrthoDBv10

hoelzer commented 2 years ago

@shrhops are you also able to just replace the BUSCO fungi files w/ some other BUSCO files and then the db still works w/ dammit?

shrhops commented 2 years ago

@hoelzer actually I tried running dammit all day yesterday, and it spent a really long time on the TransDecoder part. At some point my VPN timed out and the whole process failed, so unfortunately I don't have any results yet. On the other hand, it was running, even if it didn't produce results, so I would guess that it does work?

hoelzer commented 2 years ago

@shrhops did it finally run? I currently also have issues w/ TransDecoder. My input is ~8k genes and TransDecoder always breaks w/ some unclear error message.

hoelzer commented 2 years ago

fyi: I was able to solve my issue with transdecoder by changing

https://github.com/dib-lab/dammit/blob/master/dammit/config.json#L37

to

    "transdecoder": {
        "longorfs":["-m 80"],
        "predict": ["--no_refine_starts"]
    }

Otherwise I got an error because something in the TransDecoder.Predict start-refinement step failed on my input data.
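
With a conda or pip install, the `config.json` linked above sits inside the installed package rather than in a git checkout. A rough sketch for locating it and keeping a backup before hand-editing, assuming `python3` on the PATH is the interpreter dammit is installed into and that the installed package ships `config.json` at the same relative path as the repository; `pkg_file` and `backup_once` are hypothetical helper names:

```shell
# Print the path of a file shipped inside an installed Python package.
# Usage: pkg_file <package> <relative-file>
pkg_file() {
    python3 -c 'import importlib, os, sys
mod = importlib.import_module(sys.argv[1])
print(os.path.join(os.path.dirname(mod.__file__), sys.argv[2]))' "$1" "$2"
}

# Copy a file to <file>.orig, but only if no backup exists yet,
# so repeated edits never overwrite the pristine copy.
backup_once() {
    [ -e "$1.orig" ] || cp -p "$1" "$1.orig"
}

# Intended use (assumes dammit is importable in this environment):
#   cfg=$(pkg_file dammit config.json)
#   backup_once "$cfg"
#   "${EDITOR:-vi}" "$cfg"
```

Restoring the original is then a matter of copying `config.json.orig` back over `config.json`.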

shrhops commented 2 years ago

@hoelzer no, sorry. I ran out of space, so I haven't been able to run dammit again. It did eventually stop failing at the TransDecoder step, though, even without the change you mentioned.