egluckthaler / starfish

starfish: a modular toolkit for giant mobile element annotation
GNU Affero General Public License v3.0

Starfish failing to identify known starships with verified insertion site polymorphisms #13

Closed drdna closed 2 months ago

drdna commented 3 months ago

I have now analyzed two genomes that we know contain 8 full-length starship elements, and starfish failed to identify any of them. This may be due to the aforementioned failure of the bedtools intersect command:

[Tue Aug 6 08:33:45 2024] checking formatting of GFFs in ome2gff.txt..
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `bedtools intersect -a Arcadia_SF_starships/Arcadia.filt.gff -b <(grep -w mRNA /scratch/farman/STARFISH/Arcadia_processed.gff) -wao >> Arcadia_SF_starships/Arcadia.intersect.gff'

[Tue Aug 6 08:33:45 2024] error: could not execute bedtools intersect on commandline for Arcadia_SF_starships/Arcadia.filt.gff and /scratch/farman/STARFISH/Arcadia_processed.gff, exiting..

I tried to bypass the failure by running the above bedtools command directly from the command line. However, I suspect that the subsequent failure to identify elements results either from the absence of the intersect.ids file or from starfish annotate being supposed to perform other operations after bedtools intersect. I am unable to troubleshoot the issue because intermediate "files" are held only in memory and not written to disk.
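For reference, here is a rough sketch of the manual bedtools step in a form that avoids the `<( ... )` syntax. The log shows the command being parsed by `sh -c`, and process substitution is a bash extension rather than part of POSIX sh, so that is a plausible cause of the syntax error, though I have not confirmed it. The function names `extract_mrna` and `intersect_with_mrna` are hypothetical helpers of mine, not part of starfish, and the sketch assumes `mktemp` and `bedtools` are on the PATH.

```shell
#!/bin/sh
# Sketch only: replaces bash process substitution with a temporary file so
# the same intersect step also parses under a plain POSIX sh.

# Hypothetical helper: write the lines of a GFF that contain the whole
# word "mRNA" to an output file.
#   $1 = input GFF, $2 = output file
extract_mrna() {
    grep -w mRNA "$1" > "$2"
}

# Hypothetical helper: intersect a filtered GFF with the mRNA features of
# an annotation GFF and append the result to an output file.
#   $1 = filtered GFF, $2 = annotation GFF, $3 = output file
intersect_with_mrna() {
    tmp=$(mktemp) || return 1
    # grep exits non-zero when nothing matches; the temp file is then empty
    extract_mrna "$2" "$tmp"
    bedtools intersect -a "$1" -b "$tmp" -wao >> "$3"
    status=$?
    rm -f "$tmp"
    return $status
}
```

With the paths from the log above, the call would be `intersect_with_mrna Arcadia_SF_starships/Arcadia.filt.gff /scratch/farman/STARFISH/Arcadia_processed.gff Arcadia_SF_starships/Arcadia.intersect.gff`. This only reproduces the intersect output; it does not create the intersect.ids file or run anything else that starfish annotate may do afterwards.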