HajkD / LTRpred

De novo annotation of young retrotransposons
https://hajkd.github.io/LTRpred/
GNU General Public License v2.0

Error: Column `cn_3ltr` must be length 48 (the number of rows) or one, not 0 #27

Open MarquisMo opened 2 years ago

MarquisMo commented 2 years ago

Dear Dr. Hajk-Georg Drost, I recently used your program LTRpred. When I run it with the parameter "copy.number.est = TRUE", I keep getting the error shown at the end of the log below. It occurs at the "Join solo LTR Copy Number Estimation table" step, right after "Finished LTR CNV estimation!" is printed.


> LTRpred(genome.file = "moso10w.fasta", cores = 16,cluster=TRUE,copy.number.est = TRUE)
vsearch v2.14.2_linux_x86_64, 125.7GB RAM, 32 cores
https://github.com/torognes/vsearch

Running LTRpred on genome 'moso10w.fasta' with 16 core(s) and searching for retrotransposons using the overlaps option (overlaps = 'no') ...

No hmm files were specified, thus the internal HMM library will be used! See '/usr/local/lib/R/site-library/LTRpred/HMMs/hmm_*' for details.
No tRNA files were specified, thus the internal tRNA library will be used! See '/usr/local/lib/R/site-library/LTRpred/tRNAs/tRNA_library.fa' for details.
The output folder '/home/rstudio/ltrpred_data/moso10w_ltrpred' does not seem to exist yet and will be created ...

LTRpred - Step 1:
Run LTRharvest...
LTRharvest: Generating index file moso10w_ltrharvest/moso10w_index.fsa with gt suffixerator...
Running LTRharvest and writing results to moso10w_ltrharvest...
LTRharvest analysis finished!

LTRpred - Step 2:
Run LTRdigest...
Generating index file moso10w_ltrdigest/moso10w_index_ltrdigest.fsa with suffixerator...
LTRdigest: Sort index file...
Running LTRdigest and writing results to moso10w_ltrdigest...
LTRdigest analysis finished!

LTRpred - Step 3:
Import LTRdigest Predictions...

Input:  moso10w_ltrdigest/moso10w_LTRdigestPrediction.gff  -> Row Number:  547
Remove 'NA' -> New Row Number:  547
(1/8) Filtering for repeat regions has been finished.
(2/8) Filtering for LTR retrotransposons has been finished.
(3/8) Filtering for inverted repeats has been finished.
(4/8) Filtering for LTRs has been finished.
(5/8) Filtering for target site duplication has been finished.
(6/8) Filtering for primer binding site has been finished.
(7/8) Filtering for protein match has been finished.
(8/8) Filtering for RR tract has been finished.

LTRpred - Step 4:
Perform ORF Prediction using 'usearch -fastx_findorfs' ...
usearch v11.0.667_i86linux32, 4.0Gb RAM (132Gb total), 32 cores
(C) Copyright 2013-18 Robert C. Edgar, all rights reserved.
https://drive5.com/usearch

License: personal use only

00:00 45Mb    100.0% Working

WARNING: Input has lower-case masked sequences

Join ORF Prediction table: nrow(df) = 66 candidates.
unique(ID) = 66 candidates.
unique(orf.id) = 66 candidates.
Perform clustering of similar LTR transposons using 'vsearch --cluster_fast' ...
vsearch v2.14.2_linux_x86_64, 125.7GB RAM, 32 cores
https://github.com/torognes/vsearch

Running CLUSTpred with 90% as sequence similarity threshold using 16 cores ...
vsearch v2.14.2_linux_x86_64, 125.7GB RAM, 32 cores
https://github.com/torognes/vsearch

Reading file /home/rstudio/ltrpred_data/moso10w_ltrdigest/moso10w-ltrdigest_complete.fas 100%  
622022 nt in 66 seqs, min 4330, max 22210, avg 9425
Sorting by length 100%
Counting k-mers 100% 
Clustering 100%  
Sorting clusters 100%
Writing clusters 100% 
Clusters: 61 Size min 1, max 3, avg 1.1
Singletons: 58, 87.9% of seqs, 95.1% of clusters
Sorting clusters by abundance 100%
CLUSTpred output has been stored in: /home/rstudio/ltrpred_data/moso10w_ltrpred
Join Cluster table: nrow(df) = 66 candidates.
unique(ID) = 66 candidates.
unique(orf.id) = 66 candidates.
Join Cluster Copy Number table: nrow(df) = 66 candidates.
unique(ID) = 66 candidates.
unique(orf.id)) = 66 candidates.

LTRpred - Step 5:
Perform methylation context quantification..
Join methylation context (CG, CHG, CHH, CCG) count table: nrow(df) = 66 candidates.
unique(ID) = 66 candidates.
unique(orf.id) = 66 candidates.
Copy files to result folder '/home/rstudio/ltrpred_data/moso10w_ltrpred'.

LTRpred - Step 6:
Starting retrotransposon evolutionary age estimation by comparing the 3' and 5' LTRs using the molecular evolution model 'K80' and the mutation rate '1.3e-07' (please make sure the mutation rate can be assumed for your species of interest!) for 66 predicted elements ...

Please be aware that evolutionary age estimation based on 3' and 5' LTR comparisons are only very rough time estimates and don't take reverse-transcription mediated retrotransposon recombination between family members of retroelements into account! Please consult Sanchez et al., 2017 Nature Communications and Drost & Sanchez, 2019 Genome Biology and Evolution for more details on retrotransposon recombination.

LTRpred - Step 7:
The LTRpred prediction table has been filtered (default) to remove potential false positives. Predicted LTRs must have an PBS or Protein Domain and must fulfill thresholds: sim = 70%; #orfs = 0. Furthermore, TEs having more than 10% of N's in their sequence have also been removed.
Input #TEs: 66
Output #TEs: 48
Perform solo LTR Copy Number Estimation....
Run makeblastdb of the genome assembly...

Building a new DB, current time: 10/28/2022 08:06:19
New DB name:   /home/rstudio/ltrpred_data/moso10w.fasta
New DB title:  moso10w.fasta
Sequence type: Nucleotide
Deleted existing Nucleotide BLAST database named /home/rstudio/ltrpred_data/moso10w.fasta
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 1 sequences in 0.072499 seconds.
Perform BLAST searches of 3' prime LTRs against genome assembly...
Perform BLAST searches of 5' prime LTRs against genome assembly...
Import BLAST results...
Filter hit results...
Estimate CNV for each LTR sequence...
Finished LTR CNV estimation!
Join solo LTR Copy Number Estimation table: nrow(df) = 48 candidates.
unique(ID) = 48 candidates.
unique(orf.id) = 48 candidates.
Error: Column `cn_3ltr` must be length 48 (the number of rows) or one, not 0
In addition: Warning message:
`data_frame()` is deprecated as of tibble 1.1.0.
Please use `tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
> 
schraderL commented 8 months ago

Hi! I get the exact same error. Has this ever been resolved? Best, Lukas
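
For readers hitting the same message: taken literally, the error says that a `cn_3ltr` column of length 0 was being attached to a table with 48 rows. The following is a minimal base R sketch of that situation, not LTRpred's actual code. The object names `pred`, `cn_3ltr`, `msg` and `lengths_ok` are hypothetical stand-ins, and the empty vector only assumes that the copy number step returned no values. The exact wording of the message differs between base R and the tibble/dplyr versions in the log, but the cause is the same kind of length mismatch.

```r
# Hypothetical stand-in for the 48 filtered candidates; not LTRpred's real table.
pred <- data.frame(ID = paste0("candidate_", seq_len(48)))

# Hypothetical stand-in for a copy number result that came back empty.
cn_3ltr <- numeric(0)

# Attaching a zero-length column to a 48-row table fails; capture the message
# instead of stopping the script.
msg <- tryCatch({
  pred$cn_3ltr <- cn_3ltr
  "assigned"
}, error = function(e) conditionMessage(e))
message("Result of the assignment: ", msg)

# Defensive check: a new column must have length 1 or nrow(pred).
lengths_ok <- length(cn_3ltr) %in% c(1L, nrow(pred))
message("Column length compatible with table: ", lengths_ok)
```

A check like `lengths_ok` before the join would at least show whether the copy number estimate came back empty, which may help narrow down where the values are lost.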