Closed Ifengel closed 10 months ago
Dear @Ifengel, the bed file doesn't follow the correct order of columns. Probably the gtf file for TEs that you provided is not well formatted.
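To see the column-order problem concretely, here is a minimal sketch. It assumes the standard BED layout, where the first three columns are chromosome, start, and end, and start and end are non-negative integers with start not greater than end. The helper name `check_bed` and the file names are placeholders; the two "bad" rows are copied from the bed posted in this issue (tab separators assumed), and the "good" row is invented.

```shell
# check_bed: hypothetical helper that prints every row whose 2nd/3rd columns
# are not valid BED coordinates (non-negative integers, start <= end).
check_bed() {
  awk -F'\t' '
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 > $3 + 0 {
      printf "line %d: unexpected start/end: %s\n", NR, $0
    }
  ' "$1"
}

bed_dir=$(mktemp -d)
# Two rows in the column order seen in the failing TE_file.bed.
printf '8\tchr1\t8386825\t-\t.\t-187082416\n52\tchr1\t16776988\t+\t.\t-178692920\n' > "$bed_dir/bad.bed"
# An invented record in standard BED order: chrom, start, end, name, score, strand.
printf 'chr1\t100\t500\tTE_A\t0\t+\n' > "$bed_dir/good.bed"

check_bed "$bed_dir/bad.bed"    # flags both rows: column 2 is "chr1", not a start position
check_bed "$bed_dir/good.bed"   # prints nothing
```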
The right conversion from .out (output from RepeatMasker) to .gtf can be done with the following command line:
tail -n +4 RMfile.out | egrep -v 'Satellite|Simple_repeat|rRNA|Low_complexity|RNA|ARTEFACT' | awk -v OFS='\t' '{Sense=$9;sub(/C/,"-",Sense);$9=Sense;print $5,"RepeatMasker","similarity",$6,$7,$2,$9,".",$10}' > RMfile.gtf
You can try using RMfile.gtf as input to fix the error.
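As an illustration of what the command does, here is a small sketch that runs it on a made-up RepeatMasker-style .out file. The TE names, coordinates, and scores are invented, the temporary directory is a placeholder, and `grep -E` is used as the equivalent of `egrep`.

```shell
rm_dir=$(mktemp -d)

# Made-up RepeatMasker-style .out: three header lines, then three records.
cat > "$rm_dir/RMfile.out" <<'EOF'
   SW   perc perc perc  query     position in query    matching  repeat        position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family  begin end (left)  ID

  463    1.3  0.6  1.7  chr1      100   500   (9500) + TE_A      LINE/L1       1     400 (0)     1
  300    5.0  0.0  0.0  chr1      600   650   (9350) + (CA)n     Simple_repeat 1     50  (0)     2
  800   10.2  1.0  0.5  chr1      700   1200  (8800) C TE_B      LTR/Gypsy     (0)   500 1       3
EOF

# Same pipeline as above: skip the header, drop non-TE repeats,
# turn the "C" (complement) strand into "-", and print nine GTF columns.
tail -n +4 "$rm_dir/RMfile.out" \
  | grep -E -v 'Satellite|Simple_repeat|rRNA|Low_complexity|RNA|ARTEFACT' \
  | awk -v OFS='\t' '{Sense=$9;sub(/C/,"-",Sense);$9=Sense;print $5,"RepeatMasker","similarity",$6,$7,$2,$9,".",$10}' \
  > "$rm_dir/RMfile.gtf"

cat "$rm_dir/RMfile.gtf"
```

With this input, the Simple_repeat record is filtered out and two tab-separated lines remain: one for TE_A on the `+` strand and one for TE_B on the `-` strand.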
If the error is still happening, could you send me the first lines of your gtf for TEs?
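A quick way to look at those first lines and check the layout yourself, as a sketch only. It assumes the nine tab-separated columns written by the awk command above (start and end in columns 4 and 5, strand in column 7). `check_gtf` is a hypothetical helper and the sample records are invented.

```shell
# check_gtf: hypothetical helper that reports rows lacking nine tab-separated
# columns, integer start <= end, or a +/- strand; exits non-zero if any are found.
check_gtf() {
  awk -F'\t' '
    NF != 9 || $4 !~ /^[0-9]+$/ || $5 !~ /^[0-9]+$/ || $4 + 0 > $5 + 0 || $7 !~ /^[+-]$/ {
      printf "line %d looks malformed: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

gtf_dir=$(mktemp -d)
printf 'chr1\tRepeatMasker\tsimilarity\t100\t500\t1.3\t+\t.\tTE_A\n' > "$gtf_dir/ok.gtf"
# Space-separated instead of tab-separated, so awk sees a single column.
printf 'chr1 RepeatMasker similarity 100 500 1.3 + . TE_A\n' > "$gtf_dir/bad.gtf"

head -n 5 "$gtf_dir/ok.gtf"   # the first lines, as they could be pasted into the issue
check_gtf "$gtf_dir/ok.gtf" && echo "ok.gtf: layout looks fine"
check_gtf "$gtf_dir/bad.gtf" || echo "bad.gtf: needs fixing"
```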
Dear Daniel,
Thank you very much for your answer. It works!
I had two problems:
First, I downloaded a .out file from the UCSC Genome Browser, which was a bad idea. When I downloaded the file directly from RepeatMasker, it worked.
Second, I didn't use your command to obtain the gtf file.
Thanks!
Hello,
I got the following error:
The command was:
The error message was: Error: unable to open file or unable to determine types for file /mnt/nvme0n1p1/ifengel/ChimeraTE/ChimeraTE/projects/prueba_quimera/tmp/TE_file.bed
My .bed file has the following format (which is created by the pipeline itself):
8 chr1 8386825 - . -187082416
52 chr1 16776988 + . -178692920
91 chr1 33554408 - . -161917331
0 chr1 50329971 + . -145136573
0 chr1 83885790 - . -111585613
0 chr1 109051332 + . -86419645
15 chr1 125828927 + . -69642495
11 chr1 167772060 - . -27699727
0 chr1 184549326 + . -10922519
49 chr1 3145673 - . -192326175
I don't know how to fix this error.