veg / hyphy-analyses

HyPhy standalone analyses
MIT License

control over removed sequences in remove-duplicates.bf #44

Closed: Suirotras closed this issue 1 year ago

Suirotras commented 1 year ago

Hello,

I am using the remove-duplicates.bf script to remove duplicate sequences from my alignment before my analysis. It works as intended, but I have a problem with how it chooses which sequence of a duplicate pair to remove.

My analysis is focused on one species. So, when the sequence from this species and the sequence from another species are classified as duplicates, I would like the sequence from the other species to be removed. However, I have no control over which sequence gets removed.

It would be great if I could indicate a sequence identifier that I don't want removed. However, I do not know how to add this likely simple change to the remove-duplicates.bf script.

So my question is: would it be possible to add this functionality to remove-duplicates.bf?

Thanks for the help!

Sincerely, Jari

spond commented 1 year ago

Dear @Suirotras,

Great suggestion. Added as the --preserve seq1,seq2,... option in the just-released v0.2.
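
For anyone wondering what preserving named sequences during de-duplication amounts to, here is a rough Python sketch of the idea. It is an illustration only, not the HyPhy implementation. The function name remove_duplicates, the (id, sequence) tuple input, and the exact-match, case-insensitive definition of a duplicate are all assumptions made for the example. So is the choice to keep every preserved ID when two preserved sequences duplicate each other.

```python
def remove_duplicates(records, preserve=()):
    """Drop duplicate sequences, never dropping an ID listed in `preserve`.

    records  : list of (seq_id, sequence) tuples with unique IDs (assumed)
    preserve : iterable of IDs that must survive (hypothetical parameter,
               loosely mirroring the idea of a --preserve list)
    """
    preserve = set(preserve)

    # Group IDs by sequence content. Exact, case-insensitive matching is an
    # assumption of this sketch; the real script may define duplicates differently.
    groups = {}
    for seq_id, seq in records:
        groups.setdefault(seq.upper(), []).append(seq_id)

    keep = set()
    for ids in groups.values():
        protected = [i for i in ids if i in preserve]
        # Keep the protected IDs if there are any, else the first one seen.
        keep.update(protected if protected else ids[:1])

    # Return survivors in their original input order.
    return [(i, s) for i, s in records if i in keep]


records = [
    ("mouse", "ACGT"),
    ("human", "ACGT"),  # duplicate of mouse
    ("rat", "ACGA"),
    ("dog", "acga"),    # duplicate of rat, ignoring case
]

print(remove_duplicates(records))
print(remove_duplicates(records, preserve=["human"]))
```

Without a preserve list, the first member of each duplicate group is kept (mouse and rat here). With human preserved, human is kept in place of mouse.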

Best, Sergei

Suirotras commented 1 year ago

Hello @spond

Thanks for adding the option! I will certainly make use of it.

Cheers, Jari