In the pub, we say, "Over time, we hope to curate a list of genes that the preHGT pipeline frequently detects as false positives and to develop a strategy to filter them out."
Originally, I had thought of filtering by annotation name. @jonathaneisen suggested that we could instead create a BLAST database of these genes and filter by sequence similarity. I think this is a much better approach than going by name, so I wanted to record it here and continue brainstorming about potential strategies.
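To make the sequence-similarity idea concrete, here is a rough sketch of what the filtering step might look like. It assumes the curated false positives have been turned into a BLAST database (e.g. with `makeblastdb`) and that the pipeline's candidate genes have been searched against it with tabular output (`-outfmt 6`). The function names, the thresholds (`min_pident`, `max_evalue`, `min_length`), and the example gene IDs are all hypothetical placeholders, not something preHGT currently implements.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def flag_false_positives(blast_tsv, min_pident=90.0, max_evalue=1e-10, min_length=50):
    """Return the IDs of candidates with a strong hit to the false-positive database.

    The thresholds are illustrative assumptions and would need tuning.
    """
    flagged = set()
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        strong_hit = (
            float(row["pident"]) >= min_pident
            and float(row["evalue"]) <= max_evalue
            and int(row["length"]) >= min_length
        )
        if strong_hit:
            flagged.add(row["qseqid"])
    return flagged


def filter_candidates(candidate_ids, flagged):
    """Keep only the candidates that were not flagged, preserving input order."""
    return [c for c in candidate_ids if c not in flagged]


# Made-up hits: geneA matches a known false positive closely, geneB only weakly.
example_hits = io.StringIO(
    "geneA\tfp001\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t540\n"
    "geneB\tfp002\t45.0\t120\t60\t3\t5\t124\t10\t129\t2e-05\t48.1\n"
)
flagged = flag_false_positives(example_hits)
kept = filter_candidates(["geneA", "geneB", "geneC"], flagged)
```

One advantage over name-based filtering is that this would still catch a false positive whose annotation is missing or inconsistent across genomes. The cost is that we would need to pick similarity cutoffs, which is an open question.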