cboursnell / crb-blast

Conditional Reciprocal Best Blast

Lost header information #12

Open jmpolinski opened 7 years ago

jmpolinski commented 7 years ago

My transcript headers, from a Trinity assembly, are in the format ">TRINITY_DN4342|c0_g1_i1", where the g indicates the gene and the i indicates the isoform. In my output tsv file, all the characters after the "|" are missing. The entire identifier is still present in both blast outputs used to make the tsv, which makes me believe the character loss is a syntax issue with the "|" character.

Is it possible to rerun just the steps following the blast searches (i.e. finding reciprocals and writing to tsv)? If so, will changing the "|" to "_" in the headers prevent losing the last 8 characters?
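
A minimal sketch of the header renaming described above, assuming a plain FASTA file in which only lines starting with ">" are headers. The function names are hypothetical and not part of crb-blast, and whether the renamed identifiers then survive intact in the tsv is still the open question.

```python
def sanitize_header(line: str) -> str:
    """Replace '|' with '_' in a FASTA header line; leave sequence lines untouched."""
    if line.startswith(">"):
        return line.replace("|", "_")
    return line


def sanitize_fasta(lines):
    """Apply sanitize_header to every line of a FASTA file given as a list of strings."""
    return [sanitize_header(line) for line in lines]


# Example using the header format from this issue and a made-up sequence line.
example = [">TRINITY_DN4342|c0_g1_i1", "ACGTACGT"]
for line in sanitize_fasta(example):
    print(line)
```

Only header lines are changed, so the sequences themselves are left exactly as they were.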