Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

compleasm_to_hints step failing in each test #785

Closed by Gequris 3 months ago

Gequris commented 3 months ago

Hello, Dr. Hoff. Thanks for your tool.

Unfortunately, I am not able to complete any of the 3 tests successfully. I run the latest version of BRAKER on an HPC cluster, using Singularity v3.7.1 (I also tried v3.8.6 and it made no difference). In every test the following error occurs:

Error in file /opt/Augustus/scripts/compleasm_to_hints.py at line 49: Return code of subprocess was 1['/opt/compleasm_kit/compleasm.py', 'run', '-l', 'eukaryota_odb10', '-a', '/mnt/tank/scratch/gmaslakov/annotation/breaker/test3/genome.fa', '-t', '8', '-o', 'compleasm_genome_out']

I start every test using the test*.sh file, only adding AUGUSTUS_CONFIG_PATH, which points to the directory I copied from braker.sif.

I hope you can help me with this problem. Thanks in advance. I have also attached an archive with the test results: tests_compleasm.zip

UPD: I have the same problem when running the tests on my local machine in exactly the same way. I would really appreciate your help!

KatharinaHoff commented 3 months ago

There was a new compleasm release last week that I had not tested. However, I don't think the release itself is your problem, since it is running fine on my machine.

Your zip file (I only looked at test1) contained an interrupted download in the mb_downloads folder. After I deleted that temporary file, I was able to run the latest compleasm, and it finished with exit code 0.
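
A quick way to look for such a leftover is to list the files sitting directly in the download folder together with their sizes. The sketch below is only illustrative: the helper name list_download_files is made up here, it assumes the folder is called mb_downloads as in the comment above, and it deletes nothing.

```shell
# Hypothetical helper (not part of BRAKER or compleasm): print the size in
# bytes and the path of every regular file directly inside a download folder,
# so that a partial or temporary download is easy to spot.
list_download_files() {
    dir="${1:-mb_downloads}"   # folder name taken from the comment above
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # One wc call per file avoids the "total" line wc prints for several files.
    find "$dir" -maxdepth 1 -type f -exec wc -c {} \;
}
```

For example, list_download_files /path/to/workdir/mb_downloads prints one line per file; anything that looks like an unfinished download can then be removed by hand before re-running the test.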

Gequris commented 3 months ago

Thank you for the answer, I think this is the issue. The same thing happened in the two other tests. But after every run of test1.sh the temp file reappears. Are the mb_downloads folder and the temp file created during the test run, and how can I get rid of them to complete the tests successfully?

KatharinaHoff commented 3 months ago

I do not understand why it works outside of BRAKER but not inside of BRAKER. Have you tried re-running the BRAKER jobs? Maybe you had a temporary network problem when you ran it the first time? I have now run your command with your data on 3 different machines, always getting return code 0 and always getting the expected output. I don't see how I can fix this problem because I cannot reproduce it.

Gequris commented 3 months ago

It could be a firewall issue in my country, which is even more likely if file_versions.tsv is downloaded from the BUSCO servers. I'll try using a VPN to rerun the test.
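
Whether a machine can reach the lineage server can be checked before starting a run. This sketch assumes curl is installed and that the lineage index is served from https://busco-data.ezlab.org/v5/data/file_versions.tsv; both that URL and the helper name can_fetch are assumptions, not something confirmed in this thread.

```shell
# Assumed default location of the lineage index; adjust if yours differs.
BUSCO_INDEX_URL="https://busco-data.ezlab.org/v5/data/file_versions.tsv"

# Hypothetical helper: succeed only if the URL can be fetched completely
# within the time limit. -f makes curl fail on HTTP errors; -sS keeps it
# quiet except for error messages.
can_fetch() {
    url="${1:-$BUSCO_INDEX_URL}"
    curl -fsS --max-time 30 -o /dev/null "$url"
}
```

Running can_fetch && echo reachable || echo blocked on the same node that runs BRAKER gives a first hint whether a firewall is involved.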

Gequris commented 3 months ago

Dr. Hoff, thank you for the help! It really was a Russian firewall issue. I connected through a VPN in Iceland and all tests worked as they are supposed to!

joeyjoe0111 commented 2 months ago

I am encountering an issue similar to the one Gequris reported above: I am unable to run BRAKER3 successfully because of network restrictions. I am running BRAKER3 in an HPC environment using Singularity in China, where access to external servers (such as those hosting the BUSCO databases) is restricted. Is there a way to use a locally downloaded odb10 database for the --busco_lineage parameter instead of requiring external network access at runtime? If so, could you provide guidance on how to configure the local paths correctly in BRAKER3's parameters? Thank you in advance for your assistance; I look forward to your suggestions.

Gequris commented 2 months ago

Hi! I found that if you store the unzipped mb_downloads folder containing the specific BUSCO lineage in the BRAKER working directory, it works pretty well. In my case the folder was downloaded by a BRAKER run on a local machine with a VPN. Subsequent steps do not require any server that is restricted in Russia.

Also, if you have no way to run BRAKER with a VPN to download the BUSCO files, I can do it myself and share them with you. As far as I know, the genome, other databases, and RNA-seq files don't influence the content of mb_downloads.
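
Before launching a long offline run, it can help to confirm that the lineage folder is where this workaround expects it. A minimal sketch, assuming the layout described above (WORKDIR/mb_downloads/LINEAGE); the helper name has_lineage is hypothetical.

```shell
# Hypothetical helper: check that <workdir>/mb_downloads/<lineage> exists and
# is not empty. The layout is the one described in this thread, not verified
# against BRAKER's source.
has_lineage() {
    workdir="$1"
    lineage="$2"
    target="$workdir/mb_downloads/$lineage"
    if [ ! -d "$target" ]; then
        echo "missing: $target" >&2
        return 1
    fi
    # An empty folder is treated here as a failed copy or download.
    if [ -z "$(ls -A "$target")" ]; then
        echo "empty: $target" >&2
        return 1
    fi
    echo "found: $target"
}
```

For example, has_lineage /path/to/workdir insecta_odb10 returns a non-zero status and prints the expected path if the folder is absent.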

joeyjoe0111 commented 2 months ago

Thanks for your suggestions! But it doesn't work. I think it may be a path problem. The command is as follows:

    singularity exec --cleanenv --bind ${IN}:${IN},${OUT}:${OUT} \
        /storage/0_Genome/Software/Braker3/braker3.sif \
        braker.pl \
        --AUGUSTUS_CONFIG_PATH=/storage/3.Braker3/config/ \
        --species=Test1_busco \
        --genome=$GENOME \
        --prot_seq=$PROT \
        --rnaseq_sets_ids=$SRR \
        --rnaseq_sets_dirs=$SRR_Dir \
        --threads=$THREADS \
        --busco_lineage=insecta_odb10 \
        --workingdir=/storage/0_Genome/Genome/3.Braker3/test1/braker

I have already downloaded insecta_odb10 locally and stored the insecta_odb10 folder in the mb_download folder, and also tried the working directory itself, which is braker. But it still reports the same error. So I want to know: where should the insecta_odb10 folder be located? Thanks!

Gequris commented 2 months ago

@joeyjoe0111

I also tried downloading the BUSCO lineage outside the pipeline, and it didn't work either. So I ran the BRAKER pipeline on a local machine with a VPN connection and minimal data, so that BRAKER would fill the mb_downloads folder with the desired lineage by itself. Only after that did I transfer the resulting mb_downloads folder to the HPC cluster.

joeyjoe0111 commented 2 months ago

Thanks for your reply. You have helped me a lot! Because I can't run the BRAKER pipeline on a local machine, I have a request: could you run it with insecta_odb10 and share the result with me? I'm very sorry to trouble you! Anyway, thanks! Your suggestions are really useful!

Gequris commented 2 months ago

https://drive.google.com/file/d/1mhqHQk_HDQsGLN29rlbKCK9CrGrjv_u-/view?usp=sharing

Here you are, I hope it works for you. You can decompress it, rename the folder to the regular mb_downloads, and paste it into the BRAKER working directory.
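
The rename-and-paste step can be sketched as follows. The helper name place_mb_downloads and both paths are placeholders, and the function refuses to overwrite an existing mb_downloads folder.

```shell
# Hypothetical helper: move an already-decompressed folder into the BRAKER
# working directory under the name mb_downloads.
place_mb_downloads() {
    extracted="$1"   # folder obtained by decompressing the shared archive
    workdir="$2"     # BRAKER working directory
    if [ ! -d "$extracted" ]; then
        echo "not a directory: $extracted" >&2
        return 1
    fi
    if [ -e "$workdir/mb_downloads" ]; then
        echo "already exists: $workdir/mb_downloads" >&2
        return 1
    fi
    mkdir -p "$workdir" && mv "$extracted" "$workdir/mb_downloads"
}
```

For example, place_mb_downloads ./decompressed_folder /path/to/braker/workdir leaves the lineage under /path/to/braker/workdir/mb_downloads.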

joeyjoe0111 commented 2 months ago

Cool!!! It works!!! Thank you very much!!! Words cannot express my appreciation!

jwasmuth commented 1 month ago

I've had the same problem (from Canada). The solution offered by @Gequris worked for me:

    $ mkdir test2; mkdir test2/mb_download; cp -r eukaryota_odb10 test2/mb_download

where "eukaryota_odb10" was downloaded separately and unzipped. For test2.sh, you also need to comment out the lines which delete the test2 directory.