Closed Zoey1001 closed 2 years ago
Hi,
The first error is something that gets raised in RepeatCraft, but has no effect (confirmed by the author of RepeatCraft), so can be ignored.
The following ones look to be due to the version of R you have installed being old: "package ‘rlang’ was installed before R 4.0.0: please re-install it".
Try installing the latest version of R while the earlGrey environment is active:
conda install -c conda-forge r-base
conda install -c conda-forge r-rlang
and this should hopefully fix the issue!
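To see which R packages trigger this message in a long run log, the log text can be scanned for it. This is a minimal sketch, not part of Earl Grey: the function name `find_stale_r_packages` and the regular expression are assumptions, based only on the wording of the error quoted above.

```python
import re

# Matches R's load-failure hint, e.g.
#   package ‘rlang’ was installed before R 4.0.0: please re-install it
# Curly and straight quotes are both accepted, since pasted logs are often re-encoded.
STALE_PKG = re.compile(
    r"package [‘'\"]([A-Za-z0-9.]+)[’'\"] was installed before R (\d+\.\d+\.\d+)"
)


def find_stale_r_packages(log_text):
    """Return sorted, de-duplicated (package, r_version) pairs reported as stale."""
    return sorted(set(STALE_PKG.findall(log_text)))


if __name__ == "__main__":
    # Sample text copied from the error reported in this thread.
    sample = (
        "Error: package or namespace load failed for ‘tidyverse’:\n"
        "package ‘rlang’ was installed before R 4.0.0: please re-install it\n"
        "Execution halted\n"
    )
    for package, r_version in find_stale_r_packages(sample):
        print(f"{package}: installed before R {r_version}, needs re-installing")
```

Every package it reports would need re-installing in the active environment, in the same way as r-rlang above.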
OK, thank you for your reply. I have fixed the problem by re-installing R, and I got these results:
Step 6: Writing stat file..Removing tmp files...
Done
Traceback (most recent call last):
File "/home/dell/EarlGrey/scripts/repeatCraft/repeatcraft.py", line 187, in
\ ^__^
\ (oo)\_______
(__)\ )\/
||----w |
|| ||
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Attaching package: ‘plyr’
The following objects are masked from ‘package:dplyr’:
arrange, count, desc, failwith, id, mutate, rename, summarise,
summarize
The following object is masked from ‘package:purrr’:
compact
Attaching package: ‘magrittr’
The following object is masked from ‘package:purrr’:
set_names
The following object is masked from ‘package:tidyr’:
extract
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
The following object is masked from ‘package:purrr’:
transpose
[1] "/home/dell/miniconda3/envs/earlGrey/lib/R/bin/exec/R"
[2] "--no-echo"
[3] "--no-restore"
[4] "--file=/home/dell/EarlGrey/scripts/mergeRepeats.R"
[5] "--args"
[6] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.rmerge.gff.filtered"
[7] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.mergedRepeats.bed"
[8] "2592532507"
[9] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.mergedRepeats.revisedTable"
[10] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed"
[11] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.summary"
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.5 ✔ purrr 0.3.4
✔ tibble 3.1.5 ✔ dplyr 1.0.7
✔ tidyr 1.1.4 ✔ stringr 1.4.0
✔ readr 2.0.2 ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
[1] "/home/dell/miniconda3/envs/earlGrey/lib/R/bin/exec/R"
[2] "--no-echo"
[3] "--no-restore"
[4] "--file=/home/dell/EarlGrey/scripts/makeGff.R"
[5] "--args"
[6] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed"
[7] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.rmerge.gff.filtered"
[8] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.gff"
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.5 ✔ purrr 0.3.4
✔ tibble 3.1.5 ✔ dplyr 1.0.7
✔ tidyr 1.1.4 ✔ stringr 1.4.0
✔ readr 2.0.2 ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
The following object is masked from ‘package:purrr’:
transpose
[1] "/home/dell/miniconda3/envs/earlGrey/lib/R/bin/exec/R"
[2] "--no-echo"
[3] "--no-restore"
[4] "--file=/home/dell/EarlGrey/scripts/autoPie.R"
[5] "--args"
[6] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed"
[7] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.gff"
[8] "2592532507"
[9] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_summaryFiles/P_l.summaryPie.pdf"
[10] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_summaryFiles/P_l.highLevelCount.txt"
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.5 ✔ purrr 0.3.4
✔ tibble 3.1.5 ✔ dplyr 1.0.7
✔ tidyr 1.1.4 ✔ stringr 1.4.0
✔ readr 2.0.2 ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
The following object is masked from ‘package:purrr’:
transpose
Attaching package: ‘magrittr’
The following object is masked from ‘package:purrr’:
set_names
The following object is masked from ‘package:tidyr’:
extract
[1] "/home/dell/miniconda3/envs/earlGrey/lib/R/bin/exec/R"
[2] "--no-echo"
[3] "--no-restore"
[4] "--file=/home/dell/EarlGrey/scripts/autoLand.R"
[5] "--args"
[6] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_RepeatLandscape/P_l.divsum"
[7] "2592532507"
[8] "P_l"
[9] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_summaryFiles/P_l.repeatLandscape.pdf"
summarise() has grouped output by 'species', 'classif'. You can override using the .groups argument.
Completed: 100
Completed: 200
Completed: 300
Completed: 400
Completed: 500
Completed: 600
summarise() has grouped output by 'species', 'Divergence'. You can override using the .groups argument.
However, when I checked the results, I found that a large share of the repeats are "Unclassified". Could you please tell me how to fix this?

tclassif                                    cov        count    proportion            gen         Number_of_Distinct_Classifications
DNA                                         381289164  646989   0.147072086066615     2592532507  7536
LINE                                        114698753  169894   0.0442419729319907    2592532507  6933
LTR                                         104741925  89611    0.0404013931232068    2592532507  6829
Other (Simple Repeat, Microsatellite, RNA)  1386767    1317     0.000534908239821735  2592532507  396
Penelope                                    4598716    7150     0.0017738315672352    2592532507  1461
Rolling Circle                              26069630   40011    0.0100556617630099    2592532507  3091
SINE                                        4759158    14464    0.00183571777293051   2592532507  1263
Unclassified                                618695287  1101120  0.238645141509117     2592532507  8088
There isn't necessarily a way to "fix" unclassified TEs, as that is exactly what they are: sequences that automated methods cannot currently classify.
Depending on which TE libraries you specify with the -r flag, there may be many unclassified TE families. The TE landscape of one species can differ greatly from that of even closely related species, and where close relatives are poorly sampled, many TEs are identified as TEs by de novo methods (such as RepeatModeler, used here) but cannot be classified.
A good example of this is in the Earl Grey preprint. When we only used RepeatMasker with a library of known TEs for Coleoptera, M. bipustulatus appeared to have a TE content of ~3%, consisting mostly of known repeats. However, when we used de novo methods, we estimated TE content to be between 31.86% and 56.57%, much of it unclassified.
Overall, depending on the species you are looking at, and the availability of sequences for related species, you may find a high number of unclassified TEs. These still have the potential to be real TEs that we don't currently have an understanding of, or they are not similar enough to known TEs to be classified. Some of these TE families can probably be classified using manual annotation, and as sampling of TEs in more diverse genomes increases, the level of unclassified families should decrease.
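To put the numbers from the summary table in the question into perspective, the proportions can be recomputed from the covered bases. This is a rough sketch that assumes `proportion` is simply `cov` divided by `gen`, which is consistent with the values shown; the variable and function names are illustrative and not part of Earl Grey.

```python
# Covered bases per classification, copied from the summary table in the question.
COVERAGE = {
    "DNA": 381289164,
    "LINE": 114698753,
    "LTR": 104741925,
    "Other (Simple Repeat, Microsatellite, RNA)": 1386767,
    "Penelope": 4598716,
    "Rolling Circle": 26069630,
    "SINE": 4759158,
    "Unclassified": 618695287,
}
GENOME_SIZE = 2592532507  # the "gen" column


def proportions(coverage, genome_size):
    """Fraction of the genome covered by each classification (assumed: cov / gen)."""
    return {name: bases / genome_size for name, bases in coverage.items()}


props = proportions(COVERAGE, GENOME_SIZE)
total_repeat_fraction = sum(props.values())
unclassified_share = COVERAGE["Unclassified"] / sum(COVERAGE.values())

if __name__ == "__main__":
    print(f"Total repeat content of genome: {total_repeat_fraction:.2%}")
    print(f"Unclassified, as fraction of genome: {props['Unclassified']:.2%}")
    print(f"Unclassified, as share of annotated repeats: {unclassified_share:.2%}")
```

On these figures, roughly half of the annotated repeat sequence is unclassified, which is the situation described above for species with poorly sampled relatives.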
Okay, I got it. Thank you!
Progress:3571426/3571437...
Progress:3571427/3571437...
Progress:3571428/3571437...
Progress:3571429/3571437...
Progress:3571430/3571437...
Progress:3571431/3571437...
Progress:3571432/3571437...
Progress:3571433/3571437...
Progress:3571434/3571437...
Progress:3571435/3571437...
Progress:3571436/3571437...
Step 5: Merging GFF records by labels...
Step 6: Writing stat file..Removing tmp files...
Done
Traceback (most recent call last):
File "/home/dell/EarlGrey/scripts/repeatCraft/repeatcraft.py", line 187, in
rcStatm.rcstat(rclabelp=outputnamelabel,rmergep=outputnamemerge,outfile= statfname, ltrgroup = True)
File "/home/dell/EarlGrey/scripts/repeatCraft/helper/rcStatm.py", line 54, in rcstat
if rowRaw.get(col[2]):
IndexError: list index out of range
< Resolving Overlapping Repeats >
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
The following objects are masked from ‘package:base’:
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
Loading required package: IRanges
Loading required package: GenomeInfoDb
Warning messages:
1: package ‘GenomicRanges’ was built under R version 4.1.2
2: package ‘S4Vectors’ was built under R version 4.1.2
3: package ‘IRanges’ was built under R version 4.1.2
Warning message:
package ‘ape’ was built under R version 4.1.2
[1] "/home/dell/miniconda3/envs/earlGrey/lib/R/bin/exec/R"
table = pd.read_csv(input, names = ['scaf', 'start', 'end', 'repeat', 'score', 'strand'], delim_whitespace = True, header = None)
File "/home/dell/miniconda3/envs/earlGrey/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/dell/miniconda3/envs/earlGrey/lib/python3.6/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, kwds)
File "/home/dell/miniconda3/envs/earlGrey/lib/python3.6/site-packages/pandas/io/parsers.py", line 948, in init
self._make_engine(self.engine)
File "/home/dell/miniconda3/envs/earlGrey/lib/python3.6/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/home/dell/miniconda3/envs/earlGrey/lib/python3.6/site-packages/pandas/io/parsers.py", line 2010, in init
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.cinit
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed'
mv: cannot stat '/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed.2': No such file or directory
Error: package or namespace load failed for ‘tidyverse’:
package ‘rlang’ was installed before R 4.0.0: please re-install it
Execution halted
[2] "--no-echo"
[3] "--no-restore"
[4] "--file=/home/dell/EarlGrey/scripts/filteringOverlappingRepeats.R"
[5] "--args"
[6] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.rmerge.gff.sorted"
[7] "/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.rmerge.gff.filtered"
Error: package or namespace load failed for ‘tidyverse’:
package ‘rlang’ was installed before R 4.0.0: please re-install it
Execution halted
cp: cannot stat '/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed': No such file or directory
Traceback (most recent call last):
File "/home/dell/EarlGrey/scripts/backSwap.py", line 14, in
< Done! >
< Generating Summary Plots >
Error: package or namespace load failed for ‘tidyverse’:
package ‘rlang’ was installed before R 4.0.0: please re-install it
Execution halted
Error: package or namespace load failed for ‘tidyverse’:
package ‘rlang’ was installed before R 4.0.0: please re-install it
Execution halted
< Identifying TE Clusters and Member Sequences >
Error: Unable to open file /home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed. Exiting.
Error: The requested file (/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed) could not be opened. Error message: (No such file or directory). Exiting!
< Tidying Directories and Organising Important Files >
cp: cannot stat '/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.bed': No such file or directory
cp: cannot stat '/home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_mergedRepeats/looseMerge/P_l.filteredRepeats.gff': No such file or directory
< Done in 01:05:05.00 >
< De Novo Library, Combined Library, Summary Figures, and TE Quantifications in Standard Formats Can Be Found in /home/dell/1k_spider/zq/PL/repeat/P_lEarlGrey/P_l_summaryFiles/ >
Could you please tell me how to solve it? Thank you!