fhcrc / seqmagick

An imagemagick-like frontend to Biopython SeqIO
http://seqmagick.readthedocs.org
GNU General Public License v3.0

Anchored --pattern-replace doesn't replace #39

Closed: sjackman closed this issue 10 years ago

sjackman commented 10 years ago

In the following example, the resulting header is "bye hello friend", but the expected result is "bye friend".

$ cat a.fa
>hello friend
ACGT
$ seqmagick convert --pattern-replace '^hello$' 'bye' a.fa b.fa
$ cat b.fa
>bye hello friend
ACGT

cmccoy commented 10 years ago

Thanks for reporting. This is definitely a bug, partly in the docs (which specify that the sequence ID is the target for replacement, when in fact it's the whole description), and partly in the transformation. I'm changing the documentation to:

Replace regex pattern "search_pattern" with "replace_pattern" in sequence identifier

After commit 515f55eb7ed2607b3c216ca5c8ced4bae8df51a5, your example leaves the input unchanged, while the pattern ^hello\s generates the expected result. Is that more intuitive?
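
A minimal Python sketch of how the reported header could arise, and why the anchored pattern stops matching once the whole header text is the target. It assumes the FASTA writer emits the ID followed by the description unless the description already starts with the ID. The function names (fasta_title, replace_id_only, replace_in_description) are hypothetical illustrations, not seqmagick's code; only re.sub is a real library call.

```python
import re


def fasta_title(seq_id, description):
    """Build a header line (without '>'); assumed writer behavior."""
    # If the description already starts with the ID, do not repeat the ID.
    if description and description.split(None, 1)[0] == seq_id:
        return description
    return f"{seq_id} {description}".strip()


def replace_id_only(seq_id, description, pattern, repl):
    """Hypothetical pre-fix behavior: only the ID is rewritten."""
    new_id = re.sub(pattern, repl, seq_id)
    # The description still carries the old ID, so both end up in the header.
    return fasta_title(new_id, description)


def replace_in_description(description, pattern, repl):
    """Hypothetical post-fix behavior: the whole header text is the target."""
    new_description = re.sub(pattern, repl, description)
    parts = new_description.split(None, 1)
    new_id = parts[0] if parts else ""
    return fasta_title(new_id, new_description)


# The example from the report: ID "hello", description "hello friend".
before = replace_id_only("hello", "hello friend", r"^hello$", "bye")
after = replace_in_description("hello friend", r"^hello$", "bye")
print(before)  # bye hello friend
print(after)   # hello friend
```

Under these assumptions, ^hello$ matches the bare ID "hello" but cannot match "hello friend", because $ requires the string to end right after "hello".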

sjackman commented 10 years ago

Yeah, that seems reasonable. I'd like the same pattern to work whether or not the comment is empty. I think ^hello\b handles that, where \b matches a word boundary. Thanks, Connor.
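
As a rough illustration of the difference between the three patterns discussed in this thread, the sketch below applies each one with Python's re.sub to a header with a comment ("hello friend") and one without ("hello"). It assumes the pattern is applied to the whole header text with the replacement string 'bye'; the variable names are illustrative only.

```python
import re

headers = ["hello friend", "hello"]
patterns = [r"^hello$", r"^hello\s", r"^hello\b"]

# Map each pattern to the rewritten form of every header.
results = {
    pattern: [re.sub(pattern, "bye", header) for header in headers]
    for pattern in patterns
}

for pattern, rewritten in results.items():
    print(pattern, rewritten)
# ^hello$  ['hello friend', 'bye']
# ^hello\s ['byefriend', 'hello']
# ^hello\b ['bye friend', 'bye']
```

Note that \s consumes the space it matches, so with that pattern the replacement string would need to supply its own trailing space, and a header with no comment is left untouched. \b is a zero-width assertion, so it keeps the space and also matches at the end of the string.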