bgruening / galaxytools

:microscope::books: Galaxy Tool wrappers
MIT License

fgrep (grep -F) option #1184

Open eschen42 opened 2 years ago

eschen42 commented 2 years ago

fgrep functionality (available with grep -F) allows searching for many (m) fixed strings among n sequences in a single pass, in roughly O(n) time rather than O(n*m), by leveraging the Aho-Corasick algorithm. For a concrete example, I have a fasta_to_tabular result (20,000 lines) that I want to search for many accession IDs (8,000); I might just as easily wish to search for a large number of arbitrary peptide sequences.
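To illustrate why this scales, here is a minimal Python sketch of the Aho-Corasick idea: build one automaton from all patterns, then scan each line once. It is not how grep or the Galaxy tool is implemented. The function names `build_automaton` and `matching_lines` and the sample data are hypothetical, chosen only for this illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links, and output sets for the patterns."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state for the longest proper suffix that is a trie prefix
    out = [set()]  # out[state] -> patterns that end at this state
    for pat in patterns:
        if not pat:
            continue  # skip empty patterns (assumption: they should not match everything)
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)
    # Breadth-first pass: depth-1 states fail to the root, deeper states follow suffixes.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def matching_lines(lines, patterns):
    """Return the lines that contain at least one of the fixed-string patterns."""
    goto, fail, out = build_automaton(patterns)
    hits = []
    for line in lines:
        state = 0
        for ch in line:
            while state and ch not in goto[state]:
                state = fail[state]
            state = goto[state].get(ch, 0)
            if out[state]:
                hits.append(line)
                break  # one match is enough to keep the line
    return hits


if __name__ == "__main__":
    # Made-up rows shaped like fasta_to_tabular output (ID, tab, sequence).
    rows = ["P12345\tMKVLAA", "Q99999\tGGGSSS", "A00001\tAAPEPTIDEKK"]
    print(matching_lines(rows, ["P12345", "PEPTIDE"]))
```

On the command line, the equivalent would be something like `grep -F -f patterns.txt input.tabular`, where `-f` reads one fixed string per line from a file (both file names here are placeholders).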

So, my issue (or question) is the approach to take:

@bgruening Would you suggest that I submit a PR for the "Search in textfiles (grep)" tool?

bgruening commented 2 years ago

> @bgruening Would you suggest that I submit a PR for the "Search in textfiles (grep)" tool?

Yes, I think so :)

Thanks and sorry for my late reply.