lfaino / LoReAn

Long Reads Annotation pipeline
MIT License

errors during the run of Augustus #26

Closed. michieitel closed this issue 5 years ago

michieitel commented 5 years ago

Hi Luigi!

I ran the pipeline (options: --stranded --proteins --short_reads --adapter --mask_genome --max_intron_length 10000) and it created an error:

Traceback (most recent call last):
  File "/opt/LoReAn/code/lorean.py", line 560, in main()
  File "/opt/LoReAn/code/lorean.py", line 288, in main
    augustus_file, genemark_file = inputEvm.braker_folder_find(braker_folder)
  File "/opt/LoReAn/code/prepareEvmInputs.py", line 115, in braker_folder_find
    gff = [y for x in os.walk(location) for y in glob(os.path.join(x[0], "augustus.hints.gff"))][0]
IndexError: list index out of range

Any suggestions?

Unfortunately, I cannot provide many more details of the run, since all intermediate files (outputs of previous tools) were deleted with this crash...

Is it possible to include intermediates in future versions and restart from where the pipeline crashed?!

Thanks for your help Michael

lfaino commented 5 years ago

I think that I solved it. I need a few days to make a new image.

michieitel commented 5 years ago

That's awesome! Thanks! Looking forward to trying again... Michael

michieitel commented 5 years ago

Hi Luigi! Any updates on this? Best Michael

lfaino commented 5 years ago

@michieitel I made a new image. Can you test it?

cheers Luigi

michieitel commented 5 years ago

Hi Luigi!

Thanks for creating a new image. How can I check it is the latest?

I pulled with: singularity pull docker://lfaino/lorean:noIPRS

when running 'lorean -h' it says 2017 in the last line...!?

cheers Michael

lfaino commented 5 years ago

@michieitel please work with

singularity pull docker://lfaino/lorean:latest

cheers

michieitel commented 5 years ago

ah I see. thanks. will try and report

michieitel commented 5 years ago

I pulled using the link above... still says 2017 ;-)

But since I got the lorean_latest.sif I assume it should be the correct version...

lfaino commented 5 years ago

i guess so. I never noticed

michieitel commented 5 years ago

same error again:

Traceback (most recent call last):
  File "/opt/LoReAn/code/lorean.py", line 596, in main()
  File "/opt/LoReAn/code/lorean.py", line 320, in main
    augustus_file, genemark_file = inputEvm.braker_folder_find(braker_folder)
  File "/opt/LoReAn/code/prepareEvmInputs.py", line 115, in braker_folder_find
    gff = [y for x in os.walk(location) for y in glob(os.path.join(x[0], "augustus.hints.gff"))][0]
IndexError: list index out of range

PLUS all intermediates were removed automatically again...
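The `IndexError` in the traceback comes from taking element `[0]` of an empty list: no `augustus.hints.gff` exists under the BRAKER folder, which usually means an earlier step failed. A minimal sketch of a lookup that reports this directly is shown below. The function name `find_first` is hypothetical and this is not LoReAn's actual code; it only reuses the `os.walk` and `glob` pattern visible in the traceback.

```python
import os
from glob import glob


def find_first(location, filename):
    """Return the first file named `filename` found anywhere under `location`.

    Hypothetical helper: raises a descriptive error instead of an IndexError
    when nothing matches.
    """
    matches = [
        path
        for root, _dirs, _files in os.walk(location)
        for path in glob(os.path.join(root, filename))
    ]
    if not matches:
        raise FileNotFoundError(
            f"no {filename!r} found under {location!r}; an upstream step "
            "probably failed, check its log and error files"
        )
    return matches[0]
```

Called as `find_first(braker_folder, "augustus.hints.gff")`, this would point at the missing file rather than at the list index.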

lfaino commented 5 years ago

@michieitel can you send me the singularity command? thanks

michieitel commented 5 years ago

singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -t 20 -sp cbas_masurca2 \
--stranded \
--proteins /home/ubuntu/cbas/lorean/data/PORI_Demo_Amphimedon_queenslandica_v2.1__P__FERNANDEZ-VALVERDE.fasta \
--short_reads /home/ubuntu/cbas/lorean/data/CBAS_Concatenated_RNAseq_Read1_Clean_Datasets.fastq,/home/ubuntu/cbas/lorean/data/CBAS_Concatenated_RNAseq_Read2_Clean_Datasets.fastq \
--long_reads /home/ubuntu/cbas/lorean/data/cbas_cDNA_polyA-guppy-3.1.5-hac.porechop_100bp_to_20kb_combined.fastq \
--adapter /home/ubuntu/cbas/lorean/data/TruSeq3-PE-2.fa \
--mask_genome \
--working_dir lorean_run1 \
--max_intron_length 40000 \
/home/ubuntu/cbas/lorean/data/CBAS_MASURCA-2_final.genome.scf.fasta 2> cbas_lorean_run1.log

lfaino commented 5 years ago

We can try adding --keep_tmp to the lorean options, so the folder will not disappear. Do you have the GeneMark key in the home folder of the user that runs LoReAn?

michieitel commented 5 years ago

ubuntu@lorean:~/cbas/lorean$ cat ~/.gm_key
TTGTTCAATTAGCACGGATGTTTTTTTTTTTTTTTTCCGTCGCCATAAAGTTACTAACAGAATTCAAAAGGGAGCGCATA 520951310

michieitel commented 5 years ago

If it helps to keep intermediates to find the error, I can run again with the suggested option...

lfaino commented 5 years ago

ok I will test the image again to see if something is wrong.

michieitel commented 5 years ago

should I run it on a dummy set?

michieitel commented 5 years ago

can you include a test dataset in the repo that we can both run?

lfaino commented 5 years ago

it is there.

https://github.com/lfaino/LoReAn_Example

michieitel commented 5 years ago

oh there is one. I will try using that one!

michieitel commented 5 years ago

Not sure if I can ask here, but would it be too difficult to implement an option for passing user-defined settings to the gmap step for long reads? I am asking since I have figured out the settings that work best for my set of nanopore reads.

lfaino commented 5 years ago

which setting do you mean? can you tell me the option?

lfaino commented 5 years ago

On the toy dataset all works fine... Is the /home/ubuntu/cbas/lorean/config/ folder writable?

michieitel commented 5 years ago

for gmap settings I am using: -k 15 -B 4 --cross-species -A --exons=cdna --format=samse --npaths=0 --sam-extended-cigar

not sure which of these you included...

michieitel commented 5 years ago

/home/ubuntu/cbas/lorean/config/ is accessible

michieitel commented 5 years ago

I got the same error for both example datasets ...

michieitel commented 5 years ago

example 1: Plicaturopsis

singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -a -d -f -mg -t 20 --keep_tmp -rp repeats.scaffold3.bed \
-sr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.short_1.fastq,/home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.short_2.fastq \
-lr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.long.fasta \
-pr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.prot.fasta \
-sp crispa \
--working_dir Plicaturopsis \
/home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.fasta

example 2: Verticillium

singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -t 20 --keep_tmp -a -f -d \
-sr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/readsChr.subset.fastq \
-lr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/longReadsChr8.fastq \
-pr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/subset.prot.fasta \
-sp JR2 \
--working_dir Verticillium \
-mg /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/chr8.fasta

lfaino commented 5 years ago

In the braker folder inside the run folder, you should see a GeneMark folder.

Can you see any error or log file?

Can you check if you get any errors?

michieitel commented 5 years ago

What I did not understand in the examples: you specify adaptors with '-a' but then don't provide a file? What is the default?

lfaino commented 5 years ago

There is a module that looks for them

michieitel commented 5 years ago

so I don't have to specify the adaptor file? it is included in the image?

michieitel commented 5 years ago

how can I access the braker folder of the image?

lfaino commented 5 years ago

It is better if you specify it, but it will work without it.

Can you find out if genemark worked in the braker folder?

michieitel commented 5 years ago

how can I access the braker folder of the image?

lfaino commented 5 years ago

It is not in the image. In the folder where you run there is a LoReAn folder. Can you see it?

michieitel commented 5 years ago

error log for example 1:

Use of uninitialized value $epath in concatenation (.) or string at /opt/LoReAn/third_party/software/BRAKER/scripts//braker.pl line 2370.
ERROR in file /opt/LoReAn/third_party/software/BRAKER/scripts//braker.pl at line 5616
Failed to execute: perl /opt/LoReAn/third_party/software/gm_et_linux_64/gmes_petap//gmes_petap.pl --verbose --sequence=/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/genome.fa --ET=/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/genemark_hintsfile.gff --et_score 10 --max_intergenic 50000 --cores=9 --fungus 1>/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/GeneMark-ET.stdout 2>/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/errors/GeneMark-ET.stderr

looks like genemark cannot be fired up? key-related?

lfaino commented 5 years ago

Can you check the .error file of genemark?

What is inside?

michieitel commented 5 years ago

No .err file in the GeneMark folder. Content:

drwxr-xr-x 6 ubuntu ubuntu 4.0K Jul 30 17:55 GeneMark-ET
-rw-r--r-- 1 ubuntu ubuntu 1.5K Jul 30 17:55 GeneMark-ET.stdout
-rw-r--r-- 1 ubuntu ubuntu   10 Jul 30 17:54 bam_header.map
-rw-r--r-- 1 ubuntu ubuntu  714 Jul 30 17:55 braker.error.log
-rw-r--r-- 1 ubuntu ubuntu 9.0K Jul 30 17:54 braker.log
drwxr-xr-x 2 ubuntu ubuntu 4.0K Jul 30 17:54 errors
-rw-r--r-- 1 ubuntu ubuntu 436K Jul 30 17:54 genemark_hintsfile.gff
-rw-r--r-- 1 ubuntu ubuntu 2.4M Jul 30 17:54 genome.fa
-rw-r--r-- 1 ubuntu ubuntu   10 Jul 30 17:54 genome_header.map
-rw-r--r-- 1 ubuntu ubuntu 436K Jul 30 17:54 hintsfile.gff
drwxr-xr-x 2 ubuntu ubuntu 4.0K Jul 30 17:54 species

lfaino commented 5 years ago

This

/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/errors/GeneMark-ET.stderr

michieitel commented 5 years ago

GeneMark.hmm eukaryotic 3
GeneMark.hmm eukaryotic 3
Your license period has ended. We hope that you found this
Your license period has ended. We hope that you found this
software useful. If you would like to renew this license,
software useful. If you would like to renew this license,
please contact GeneProbe, Inc. at custserv@genepro.com
please contact GeneProbe, Inc. at custserv@genepro.com

(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.
(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.
(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.

lfaino commented 5 years ago

The GeneMark key has expired. You need a new one.
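Since the expired key only surfaced in `GeneMark-ET.stderr`, a small check of that file after a failed run can save some digging. The sketch below is an assumption-laden helper, not part of LoReAn or BRAKER: the function name `genemark_license_expired` is hypothetical, and the marker string is simply copied from the stderr output quoted in this thread.

```python
from pathlib import Path

# Phrase copied from the GeneMark-ET.stderr output quoted in this thread.
EXPIRED_MARKER = "Your license period has ended"


def genemark_license_expired(stderr_path):
    """Return True if the given GeneMark stderr file reports an expired license.

    Hypothetical helper: a missing file is treated as "no evidence" and
    returns False.
    """
    path = Path(stderr_path)
    if not path.is_file():
        return False
    return EXPIRED_MARKER in path.read_text(errors="replace")
```

For the run above it could be called with `LoReAn_Plicaturopsis/run/braker/errors/GeneMark-ET.stderr`, provided the run used `--keep_tmp` so that the folder still exists.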

michieitel commented 5 years ago

I feel stupid now... let me get it and run again. many thanks for now

michieitel commented 5 years ago

Hi Luigi,

The pipeline finished. It was that error. Now playing with the weightings of the datasets.

Many Thanks for your help! Michael