TobyBaril / EarlGrey

Earl Grey: A fully automated TE curation and annotation pipeline

Summary plot fails because `bc` is missing (maybe) #118

Closed chriswyatt1 closed 1 week ago

chriswyatt1 commented 1 month ago

Hi,

Thanks for a great program, could you help me debug an issue I am having?

I am running from the docker container (tobybaril/earlgrey_dfam3.7:latest), and it almost completes, but dies with the following error:

Compare_te_error.txt

I think the main issue is `/usr/local/bin/earlGrey: line 283: bc: command not found`

So I tried to rebuild the Docker container with bc added, but I suspect I do not have the most recent Dockerfile. Using an untouched Dockerfile from https://github.com/TobyBaril/EarlGrey/blob/main/Docker/Dockerfile, the build dies with the following:

 => CACHED [stage-1  2/18] RUN apt update && apt-get -y install git                                                                 0.0s
 => CACHED [stage-1  3/18] RUN apt-get -y update     && apt-get -y install         aptitude         libgomp1         perl           0.0s
 => CANCELED [stage-1  4/18] RUN cd /opt/         && curl https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh --o  29.6s
 => CACHED [builder  2/19] RUN apt-get -y update && apt-get -y install     curl gcc g++ make zlib1g-dev libgomp1     perl     pyth  0.0s
 => [builder  3/19] RUN apt-get -y update     && apt-get -y install         aptitude         libgomp1         perl         python  28.7s
 => ERROR [builder  4/19] COPY src/* /opt/src/ 

Have you come across this issue before? Maybe I am not using the program correctly. My command is:

earlGrey -g genome_tiny.fasta -s $species -o \${mydir}/${species}_earl_results -t ${task.cpus}
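If the only goal is to have bc available, one alternative to rebuilding from the repository Dockerfile is a small derived image layered on the published one. This is a minimal sketch, not something confirmed in this thread. It assumes the base image is Debian/Ubuntu-based (the build log above uses apt-get) and that its default user is allowed to run apt-get; if not, a `USER root` line would be needed before the `RUN`. The function name `write_bc_dockerfile` and the directory and tag name `earlgrey_bc` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a two-line Dockerfile that adds bc on top of
# the published Earl Grey image. Assumes apt-get works in the base image.
write_bc_dockerfile() {
    local dir=$1
    mkdir -p "$dir"
    cat > "$dir/Dockerfile" <<'EOF'
FROM tobybaril/earlgrey_dfam3.7:latest
RUN apt-get update && apt-get install -y bc && rm -rf /var/lib/apt/lists/*
EOF
}

# Write the Dockerfile into a fresh directory (name is arbitrary).
write_bc_dockerfile earlgrey_bc

# Then build it (requires Docker; the tag name is arbitrary):
#   docker build -t earlgrey_bc earlgrey_bc
```

Because the derived Dockerfile contains no `COPY` step, it does not depend on the contents of the build context, which is where the build above stopped.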

TobyBaril commented 1 month ago

Hi,

Thanks for checking out Earl Grey! The bc command is only used for calculating the runtime of the pipeline, and is only invoked following the completion of an Earl Grey run. This command fails in the Docker container because bc is not present, but this has no impact on the success of the pipeline. If you are missing output files, this might be because a different step has failed.
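For illustration only, here is one way a runtime report can be produced without bc, using bash integer arithmetic. This is a sketch and not the code in the earlGrey script; the function name `format_elapsed` is hypothetical, and it assumes whole-second resolution is enough.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the earlGrey script.
# Formats a duration given in whole seconds as HH:MM:SS using only
# bash integer arithmetic, so it does not depend on bc.
format_elapsed() {
    local total=$1
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) \
        $(( (total % 3600) / 60 )) \
        $(( total % 60 ))
}

# Usage: record the start time, run the work, then report the duration.
start=$(date +%s)
# ... pipeline steps would run here ...
end=$(date +%s)
echo "Runtime: $(format_elapsed $(( end - start )))"
```

A duration of 3725 seconds is formatted as `01:02:05`.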

Which files are you missing? If you can provide the Earl Grey log file, I'll happily take a look.

thanks!

chriswyatt1 commented 1 month ago

Hi, thanks, OK, that's good to know regarding bc. Maybe it's because my dataset is a small test dataset: it is just about a third of Drosophila yakuba's X chromosome.

It is located here: s3://comparete/Drosophila_ChrX_Small.fa https://comparete.s3.amazonaws.com/Drosophila_ChrX_Small.fa

Here is my EarlGrey log: Drosophila_yakubaEarlGrey.log

TobyBaril commented 1 month ago

Is it just the landscapes that are missing? I think the issue is this line:

Starting calculations
WARNING. chromosome (NC_052526.2) was not found in the FASTA file. Skipping.
Finished calculations

This has happened to me a couple of times, but when I reran just the landscape section it worked fine, so it could be a Python parsing issue. Any ideas, @jamesdgalbraith?

Beyond this, the rest of the summaries should have been generated successfully - can you see the annotation files, curated library, etc in the summaryFiles directory?
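One quick way to look into the skipped-chromosome warning is to check whether that exact sequence ID appears as a header in the genome FASTA. The sketch below is an assumption-laden helper, not part of Earl Grey: the function name `fasta_has_id` is hypothetical, the ID is taken to be the header text up to the first whitespace, and which FASTA file the landscape step actually reads is not stated in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if a sequence ID appears as a FASTA header,
# exit 1 otherwise. The ID is the header text up to the first whitespace.
fasta_has_id() {
    local fasta=$1 id=$2
    awk -v id="$id" '
        /^>/ { name = substr($1, 2); if (name == id) { found = 1; exit } }
        END  { exit found ? 0 : 1 }
    ' "$fasta"
}

# Usage (the file name is a placeholder):
#   fasta_has_id genome.fasta NC_052526.2 && echo present || echo missing
```

If the ID is present in the FASTA and the warning still appears, that would be consistent with a parsing problem in the landscape step rather than with the input file.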

chriswyatt1 commented 1 month ago

I will try again next week and let you know (I am away this week). It was all run in a container, so I do not have the results folder to hand, but I am pretty sure most of the summaryFiles were there.

TobyBaril commented 1 week ago

Closing due to lack of activity, feel free to reopen if needed.