TobyBaril / EarlGrey

Earl Grey: A fully automated TE curation and annotation pipeline

an error occurred, "Refining genome not found" #40

Closed lvqiang0120 closed 1 year ago

lvqiang0120 commented 1 year ago

Hi Professor, thank you for developing this pipeline for TE annotation. I recently used it to annotate an arthropod genome of 2.7 Gb. After running for more than 27 hours, it stopped. The last 50 lines of the log file are shown below:

RepeatScout/RECON discovery complete: 4091 families found

RepeatClassifier Version 2.0.4

Program Time: 27:15:15 (hh:mm:ss) Elapsed Time
Working directory: /public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_RepeatModeler/RM_28569.SatApr80935442023
may be deleted unless there were problems with the run.

The results have been saved to:
  /public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_Database/spe-families.fa - Consensus sequences for each family identified.
  /public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_Database/spe-families.stk - Seed alignments for each family identified.
  /public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_Database/spe-rmod.log - Execution log. Useful for reproducing results.

The RepeatModeler stockholm file is formatted so that it can easily be submitted to the Dfam database. Please consider contributing curated families to this open database and be a part of this growing community resource. For more information contact help@dfam.org.

          )  (
         (   ) )
         ) ( (
       _______)_
    .-'---------|
   ( C|/\/\/\/\/|
    '-./\/\/\/\/|
     '_________'
      '-------'
    <<< Straining TEs and Refining de novo Consensus Sequences >>>

Refining genome not found
Usage: [-l Repeat library] [-g Genome ] [-t Threads (default 4) ] [-f Flank (default 1000) ] [-r Numver of iterations of BEET to run (deafult 10)] [-d Out directory, if not specified wil be created ] [-h Print this help] [-M Ammount of memory TEstrainer needs to keep free]
cp: cannot stat ‘/public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_strainer/TS*/spe-families.fa.strained’: No such file or directory
ERROR: TEstrainer failed to produce a strain file, please check the log file for more information

I went into the directory "/public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_strainer" and it is empty. However, the pipeline ran successfully and produced all result files when I tested it with a small genome file. I don't know what is wrong or how to solve this problem. Looking forward to your reply.
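
For anyone checking the same symptom, here is a minimal sketch of how to confirm whether TEstrainer left a strained library behind. The glob pattern is copied from the `cp` error message above; the function name `find_strained_library` and its arguments are hypothetical helpers, not part of Earl Grey.

```python
import glob
import os


def find_strained_library(strainer_dir, species):
    """Return non-empty files matching <strainer_dir>/TS*/<species>-families.fa.strained.

    The pattern mirrors the path in the 'cp: cannot stat' message from the log;
    the layout of the TS* directory is otherwise an assumption.
    """
    pattern = os.path.join(strainer_dir, "TS*", f"{species}-families.fa.strained")
    # Ignore zero-byte files, which would also indicate a failed run.
    return sorted(p for p in glob.glob(pattern) if os.path.getsize(p) > 0)


# Example call (paths taken from the log above):
# find_strained_library("/public/home/rp1016swf/02.EarlG/spe_EarlGrey/spe_strainer", "spe")
```

An empty list means the copy step has nothing to pick up, which matches the empty `spe_strainer` directory described here.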

manasealoo commented 1 year ago

Hi, I am also getting a similar error: " --cp: cannot stat ‘/mnt/lustre/users/maloo/scripts/Test_1/Test_run_EarlGrey/Test_run_strainer/TS*/Test_run-families.fa.strained’: No such file or directory ERROR: TEstrainer failed to produce a strain file, please check the log file for more information"

Did you find any solution?

TobyBaril commented 1 year ago

Hi,

There's a couple of things this could be, some of which were fixed in the latest release. Please could you upload the full log file, and the output directory structure with all present files and their sizes so I can look into this.
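
One way to produce the requested directory structure with file sizes is sketched below. It is only an illustration: `list_files_with_sizes` and `format_listing` are hypothetical names, and empty directories are reported with a trailing separator and size 0 so that they are not silently dropped.

```python
import os


def list_files_with_sizes(root):
    """Return sorted (relative_path, size_in_bytes) pairs for everything under root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not dirnames and not filenames:
            # Record empty directories too; they are often the interesting part.
            entries.append((os.path.relpath(dirpath, root) + os.sep, 0))
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                size = -1  # unreadable file or broken symlink
            entries.append((os.path.relpath(full, root), size))
    return sorted(entries)


def format_listing(entries):
    """Render the pairs as one 'size  path' line per entry."""
    return "\n".join(f"{size:>14}  {path}" for path, size in entries)


# Example call (output directory taken from the first post):
# print(format_listing(list_files_with_sizes("/public/home/rp1016swf/02.EarlG/spe_EarlGrey")))
```

The resulting text can be pasted into a comment alongside the full log file.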

Thanks!

manasealoo commented 1 year ago

I've attached a link to the output folder of earlGrey, and all the logs for different runs that are of interest: earl.out, test_run.log, .corn.log, and lastly, microsporidia.log. For Microsporidia, it shows "permission denied" when copying files just after the initial trf run.

https://drive.google.com/drive/folders/1Q5t5elT5w7byziOsI_mAoOLrbg9uy0pJ?usp=share_link

Regards


TobyBaril commented 1 year ago

Hmmm, looks like something to do with the TEstrainer function... @jamesdgalbraith any insight on what this could be? I'll also investigate a little further.

The "permission denied" error may depend on where and how you are running EarlGrey. If the job is submitted to an HPC queuing system, that process must have permission to read, write, and execute in the output directories, as well as wherever RepeatMasker is installed if you are running an initial RepeatMasker search.
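
As a rough way to test this, the sketch below reports which of the read, write, and execute permissions the current user lacks on a set of paths. The helper names are hypothetical, and the paths you pass (output directory, RepeatMasker install location) are whatever applies to your system. To be meaningful on a cluster it should be run from inside the submitted job, since `os.access` only reflects the user and environment that execute it.

```python
import os


def missing_permissions(path):
    """Return the permissions the current user lacks on path.

    Possible entries are 'read', 'write', 'execute', or ['exists'] if the path is absent.
    """
    if not os.path.exists(path):
        return ["exists"]
    checks = [("read", os.R_OK), ("write", os.W_OK), ("execute", os.X_OK)]
    return [name for name, flag in checks if not os.access(path, flag)]


def permission_report(paths):
    """Map each path to its list of missing permissions (empty list means all fine)."""
    return {p: missing_permissions(p) for p in paths}


# Example call with placeholder paths:
# permission_report(["/path/to/output_dir", "/path/to/RepeatMasker"])
```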

If you are rerunning a previously started run, it might also help to delete everything inside the TEstrainer output directory so that the pipeline can find the latest results. A better way of dealing with this is on the TODO list, but for the moment this is a workaround.
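
A cautious sketch of that cleanup step follows. `clear_directory` is a hypothetical helper, not an Earl Grey command; it defaults to a dry run that only lists what would be removed, because the deletion is irreversible. The strainer directory path is assumed to follow the pattern seen in the logs above.

```python
import os
import shutil


def clear_directory(path, dry_run=True):
    """Delete everything inside path but keep path itself.

    Returns the entries that were (or, in a dry run, would be) removed.
    """
    removed = []
    for name in sorted(os.listdir(path)):
        target = os.path.join(path, name)
        removed.append(target)
        if dry_run:
            continue
        if os.path.isdir(target) and not os.path.islink(target):
            shutil.rmtree(target)  # real subdirectory, e.g. an old TS* run
        else:
            os.remove(target)  # regular file or symlink
    return removed


# Example: inspect first, then delete (path pattern assumed from the logs).
# print(clear_directory("/path/to/out/spe_EarlGrey/spe_strainer"))
# clear_directory("/path/to/out/spe_EarlGrey/spe_strainer", dry_run=False)
```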

TobyBaril commented 1 year ago

@jamesdgalbraith has pushed a couple of fixes to TEstrainer for these issues. You will need to pull the latest repo and reconfigure earlGrey, but the issue should now be solved.

manasealoo commented 1 year ago

Awesome, thanks!
