johnlees / seer

sequence element (kmer) enrichment analysis
GNU General Public License v2.0

filter_seer doesn't return a file with a header #63

Open Mishmash-su opened 7 years ago

Mishmash-su commented 7 years ago

If you use the command: for i in $(find . -name stdout); do filter_seer -k $i --pos_beta | sed '1d' >> seer_filtered.txt; done

as in the tutorial, the file that comes out doesn't have a header. This causes problems when trying to write the kmers to fastq (the first kmer is deleted because the first line is assumed to be a header), and it also causes a lot of mismatch errors between the sam file and seer_filtered.txt when trying to map to phandango.

But actually, running that command without the sed '1d' does not give a file with a header either, which suggests to me that sed is just deleting something else. Comparing the two output files shows extra kmers in the file produced without sed '1d'. The difference is 14 lines in my case.

If relevant, this version of seer was installed sometime last fall.

johnlees commented 7 years ago

Apologies, this is a bit of a hack around the fact that, when run over multiple files, each output has its own header row, whereas the resulting file needs only one header row (at the top).

A command like

cat <(head -1 seer.1.txt) <(for i in $(find . -name stdout); do filter_seer -k $i --pos_beta | sed '1d'; done) > seer_filtered.txt

should work, I think.
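
For reference, the same "keep one header, drop the rest" idea can be sketched as a small shell function. This is only an illustration, not part of seer: it assumes each filtered output has already been saved to its own file and that every file starts with the same header row. The function name merge_keep_first_header and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: concatenate several tabular files, keeping the header line
# (line 1) of the first file only.
# Assumption: every input file starts with the same header row.
# merge_keep_first_header is a hypothetical helper name, not a seer command.
merge_keep_first_header() {
  # FNR is the line number within the current file, NR the overall line number.
  # Skip line 1 of every file except the very first line read.
  awk 'FNR == 1 && NR != 1 { next } { print }' "$@"
}

# Example usage with hypothetical file names:
# merge_keep_first_header filtered.1.txt filtered.2.txt > seer_filtered.txt
```

Unlike sed '1d' in a loop, this leaves the first file's header in place, so the combined file has exactly one header row at the top. If the inputs turn out not to have a header row at all, it would drop a real kmer line from every file after the first, so it is worth checking the first line of one output before using it.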