Closed nathanhaigh closed 11 years ago
Hi Nathan,
I implemented this in ae5be3957. I believe it implements the behaviour you are looking for. The orphans file must be different than the --output file. I don't want them to be output to the same file, as some downstream tools assume that the pairs are interleaved. Let me know if there are any problems.
jared
I wanted sga preprocess to also output orphaned paired reads to a separate file, since these can be used downstream as single-end reads and may constitute a reasonable amount of coverage, especially if strict filtering criteria are used.

I don't know C++ (I was just hacking your code), so I only have a partial implementation of this. I've added a new option, --pe-orphans, which accepts a file as its value. Here is its current behaviour:

- If --pe-orphans is specified with --pe-mode=0, an error is thrown (untested).
- If --pe-mode=1 or --pe-mode=2 and --pe-orphans is not specified, orphans are sent to STDOUT. This may not be the best behaviour, as it isn't backward compatible: STDOUT will contain a mix of interleaved pairs and orphans if --out is not specified.

However, my complete lack of C++ knowledge prevented me from coding something with this behaviour instead:

- If --pe-mode=1 or --pe-mode=2 and --pe-orphans is not specified, orphans are discarded. This is the same as the current behaviour.
- If --pe-mode=1 or --pe-mode=2 and --pe-orphans is specified, orphans are sent to the file, irrespective of whether or not --out was specified.
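
The preferred rules above could be sketched roughly as follows. This is only an illustration, not code from sga: the OrphanSink enum and the resolveOrphanSink function are hypothetical names, and the sketch assumes the options have already been parsed into an integer (for --pe-mode) and a string that is empty when --pe-orphans was not given.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Where a read whose mate was filtered out should end up.
enum class OrphanSink { Discard, OrphanFile };

// Hypothetical helper (not part of sga): choose the orphan destination.
//   peMode      mirrors --pe-mode (0 = unpaired, 1 or 2 = paired)
//   orphanFile  mirrors --pe-orphans; assumed empty when the option is absent
OrphanSink resolveOrphanSink(int peMode, const std::string& orphanFile)
{
    const bool haveOrphanFile = !orphanFile.empty();

    if (peMode == 0) {
        // Unpaired input cannot produce orphans, so the option makes no sense.
        if (haveOrphanFile)
            throw std::invalid_argument(
                "--pe-orphans requires --pe-mode=1 or --pe-mode=2");
        return OrphanSink::Discard;
    }

    // Paired modes: write orphans to the file when one was given, otherwise
    // discard them. Whether --out was given plays no part in the decision.
    return haveOrphanFile ? OrphanSink::OrphanFile : OrphanSink::Discard;
}
```

Keeping the decision in one small function means the read-processing loop only has to switch on the result, and STDOUT never receives orphans mixed in with interleaved pairs.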