I have added a protein dataset for benchmarking. It consists of all putative protein sequences from the fungus Aureobasidium pullulans, including a couple of hydrophobin proteins that can be detected with the following pattern:
[^C]{25,158} C [^C]{5,9} CC [^C]{4,44} C [^C]{7,23} C [^C]{5,7} CC [^C]{6,18} C [^C]{2,13}
We will use this dataset for testing and benchmarking, so it is safe to merge this branch for now.
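For reference, here is a minimal Python sketch of how the pattern could be applied to the dataset. It assumes the spaces in the pattern are only visual separators (hence `re.VERBOSE`) and that a match anywhere in a sequence counts as a hit. The name `find_hydrophobin_candidates` and the dict-of-sequences input are hypothetical and not part of this branch.

```python
import re

# Pattern copied from the description; re.VERBOSE makes the regex engine
# ignore the spaces that separate the elements (assumption: the spaces are
# not meant to match anything).
HYDROPHOBIN_PATTERN = re.compile(
    r"[^C]{25,158} C [^C]{5,9} CC [^C]{4,44} C [^C]{7,23} C [^C]{5,7} CC [^C]{6,18} C [^C]{2,13}",
    re.VERBOSE,
)


def find_hydrophobin_candidates(sequences):
    """Return the names of the sequences that contain a pattern match.

    `sequences` is a hypothetical mapping of sequence name -> amino acid
    string (one-letter codes). Matching is unanchored: the pattern may
    occur anywhere in the sequence.
    """
    return [
        name
        for name, seq in sequences.items()
        if HYDROPHOBIN_PATTERN.search(seq.upper())
    ]
```

If the pattern is meant to cover the whole protein rather than a region of it, `search` would need to be swapped for `fullmatch`.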