lh3 / wgsim

Reads simulator

wgsim: Please add is_flip (the orientation) to the read ID #9

Open sjackman opened 9 years ago

sjackman commented 9 years ago

From @sjackman on February 26, 2015 17:51

The read ID output by wgsim does not include the orientation of the read, is_flip. It would be really useful to include the orientation.

A read ID currently has six fields separated by underscores: @0_80_129_0:0:0_0:0:0_e/1

Can we add a seventh field that specifies 1 for reverse complement and 0 otherwise? For example: @0_80_129_0:0:0_0:0:0_e_0/1 or @0_80_129_0:0:0_0:0:0_e_1/1. I'd be happy to submit a pull request if you agree.

https://github.com/samtools/samtools/blob/develop/misc/wgsim.c#L345-L347

                fprintf(fpo[j], "@%s_%u_%u_%d:%d:%d_%d:%d:%d_%llx/%d\n", ks->name.s, ext_coor[0]+1, ext_coor[1]+1,
                        n_err[0], n_sub[0], n_indel[0], n_err[1], n_sub[1], n_indel[1],
                        (long long)ii, j==0? is_flip+1 : 2-is_flip);
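A minimal sketch of what the proposed seven-field ID could look like, assuming is_flip is simply appended after the hex pair counter. The helper name format_read_id and its parameter list are hypothetical and not part of wgsim; the coordinates are assumed to be 1-based already, and snprintf into a buffer stands in for the fprintf call above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of wgsim): formats a read ID with is_flip
 * appended as a seventh underscore-separated field.
 * start and end are assumed to be 1-based already.
 * Returns the value returned by snprintf. */
static int format_read_id(char *buf, size_t size, const char *name,
                          unsigned start, unsigned end,
                          const int n_err[2], const int n_sub[2],
                          const int n_indel[2],
                          long long pair_index, int is_flip, int j)
{
    assert(j == 0 || j == 1);
    assert(is_flip == 0 || is_flip == 1);
    return snprintf(buf, size, "@%s_%u_%u_%d:%d:%d_%d:%d:%d_%llx_%d/%d",
                    name, start, end,
                    n_err[0], n_sub[0], n_indel[0],
                    n_err[1], n_sub[1], n_indel[1],
                    (unsigned long long)pair_index,
                    is_flip,                               /* proposed seventh field */
                    j == 0 ? is_flip + 1 : 2 - is_flip);   /* mate number, as in the original */
}
```

With name "0", coordinates 80 and 129, all counts zero, pair index 14, is_flip 0 and j 0, this yields the first example ID from the proposal.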

Copied from original issue: samtools/samtools#355

sjackman commented 9 years ago

From @jkbonfield on July 8, 2015 16:19

It took some head scratching to understand how j and is_flip work there. Confusing but probably less so than 1+(j!=is_flip) :-)
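For anyone else puzzling over it, here is a small sketch that checks the two forms against each other, assuming j and is_flip only ever take the values 0 and 1. The function names are made up for illustration.

```c
#include <assert.h>

/* Mate number as computed in the quoted fprintf call. */
static int mate_ternary(int j, int is_flip)
{
    return j == 0 ? is_flip + 1 : 2 - is_flip;
}

/* The more compact form mentioned above. */
static int mate_compact(int j, int is_flip)
{
    return 1 + (j != is_flip);
}

/* Returns 1 if both forms agree for every combination of
 * j and is_flip in {0, 1}, and 0 otherwise. */
static int mate_forms_agree(void)
{
    for (int j = 0; j < 2; ++j)
        for (int is_flip = 0; is_flip < 2; ++is_flip)
            if (mate_ternary(j, is_flip) != mate_compact(j, is_flip))
                return 0;
    return 1;
}
```

In both forms the mate number is 1 when j equals is_flip and 2 otherwise.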

From my reading of the code, is_flip represents the template orientation, correct? If so I agree that it would be useful to add it. I have no idea though what parses this name and how much we'd break by adding an additional field. Certainly wgsim_eval.pl would need changing, maybe more.
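As a rough illustration of the compatibility concern, a parser could tell the two layouts apart by scanning from the right, since the sequence name itself may contain underscores. This is only a sketch under the assumption that the orientation is appended as a single 0 or 1 after the hex counter. The helper read_id_orientation is hypothetical and says nothing about how wgsim_eval.pl actually parses names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 0 or 1 if the read ID carries a trailing
 * orientation field, or -1 if it uses the current six-field layout or
 * does not look like either layout. */
static int read_id_orientation(const char *id)
{
    assert(id != NULL);
    const char *slash = strrchr(id, '/');
    const char *end = slash ? slash : id + strlen(id);
    const char *last = NULL, *prev = NULL;

    /* Locate the last two underscores before the "/mate" suffix. */
    for (const char *p = id; p < end; ++p)
        if (*p == '_') { prev = last; last = p; }
    if (!last || !prev)
        return -1;

    /* In the six-field layout the second-to-last field is "err:sub:indel",
     * which contains ':'; in the seven-field layout it is the hex counter. */
    if (memchr(prev + 1, ':', (size_t)(last - prev - 1)))
        return -1;

    /* The orientation field must be exactly one character, '0' or '1'. */
    if (end - last != 2)
        return -1;
    if (last[1] == '0') return 0;
    if (last[1] == '1') return 1;
    return -1;
}
```

A parser written this way keeps accepting existing six-field IDs, which it reports as having no orientation.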

Note there is also a newer version in https://github.com/lh3/wgsim/, so it looks like it has been forked out of samtools now.

sjackman commented 9 years ago

Which is the canonical version of wgsim, samtools/samtools or lh3/wgsim?

sjackman commented 9 years ago

From @jkbonfield on July 9, 2015 8:25

I assume lh3/wgsim as it's newer and his README seems to imply this too. However I don't know for sure.