boulund closed this issue 3 years ago
One thing that could be annoying is that it would probably increase the total runtime of BACTpipe dramatically. I think InterProScan 5 will take quite a while to process a large number of protein sequences.
I think running InterProScan is outside the scope of BACTpipe, so I'm closing this. We can reopen if we want to discuss it further in the future.
Would it be interesting to run InterProScan 5 on the protein sequences produced by Prokka?
It is a fairly sizeable download (almost 9 GB), but it comes with the following databases out of the box:
I just thought of it this morning, after hearing from Jonatan that he wanted to search for some PROSITE patterns in his data.
We could consider adding an optional step at the end of the workflow that detects whether InterProScan is installed and, if so, runs it with default settings on the Prokka protein sequences for every sample.
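To make the idea concrete, here is a rough sketch of what such an optional step could look like. It is not how BACTpipe works and only illustrates the "detect, then run per sample" logic. It assumes the launcher is on `PATH` as `interproscan.sh`, that each sample has its own Prokka output directory containing a `.faa` protein file, and that `-i` (input) and `-d` (output directory) are enough for a default run. The function names and the directory layout are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def find_interproscan(executable: str = "interproscan.sh") -> Optional[str]:
    """Return the full path to the InterProScan launcher, or None if it is not on PATH."""
    return shutil.which(executable)


def build_command(executable: str, faa: Path, outdir: Path) -> List[str]:
    """Build a default InterProScan invocation for one protein FASTA file."""
    # Assumption: -i (input file) and -d (output directory) suffice for a default run.
    return [executable, "-i", str(faa), "-d", str(outdir)]


def annotate_samples(
    prokka_root: Path,
    outdir: Path,
    executable: Optional[str],
    dry_run: bool = False,
) -> List[List[str]]:
    """Run InterProScan on every <sample>/<name>.faa under prokka_root.

    Returns the list of commands (executed, or only planned when dry_run is True).
    If no executable is given, the step is skipped and an empty list is returned.
    """
    if executable is None:
        return []  # InterProScan not installed: silently skip the optional step

    commands = []
    # Hypothetical layout: one subdirectory per sample, each holding a .faa file.
    for faa in sorted(prokka_root.glob("*/*.faa")):
        sample_out = outdir / faa.parent.name
        cmd = build_command(executable, faa, sample_out)
        commands.append(cmd)
        if not dry_run:
            sample_out.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

Calling `annotate_samples(Path("prokka_output"), Path("interproscan_output"), find_interproscan())` would then do nothing on machines without InterProScan, so the step stays optional.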