Closed wshuai294 closed 2 years ago
Hey Wang Shuai, of course! The first step was to annotate genes with different tools and databases, as described in the paper (e.g. Resfams, eggNOG, etc.). From these annotations, we mined the inferred eggNOG/InterProScan annotations for the presence of the terms listed below, using awk commands. This helped assign broad functional categories (e.g. antibiotic resistance). For this, we also used the annotations and homology information reported by each specific database/tool (e.g. Resfams family). Hope this helps!
/capsid|phage|tail|head|tape measure|antitermination/
/resolv|relax|conjug|trb|plasmid|type IV|toxin|chromosome partitioning|chromosome segregation|Resolv|Relax|Conjug|Trb|Plasmid|Type IV|Toxin|Chromosome partitioning|Chromosome segregation/
/transpos|insertion|resolv|Tra[A-Z]|Tra[0-9]|IS[0-9]|conjugate transposon|Transpos|Insertion|Resolv|Tra[A-Z]|Tra[0-9]|IS[0-9]|Conjugate transposon/
/multidrug|azole resistance|antibiotic resistance|TetR|tetracycline resistance|VanZ|betalactam|beta-lactam|antimicrob|lantibio|Multidrug|Azole resistance|Antibiotic resistance|TetR|tetr|tetR|Tetracycline resistance|VanZ|vanz|vanZ|VANZ|Betalactam|Beta-lactam|Antimicrob|Lantibio/
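For readers who would rather not use awk, here is a minimal Python sketch of the same kind of keyword matching. It is not the authors' script. The patterns are copied verbatim from the four lines above, but the category labels are an assumption based on the order in which the paper lists the categories (phage, plasmid, transposon, antibiotic resistance), and the function name `assign_categories` is hypothetical.

```python
import re

# Patterns copied verbatim from the reply above.
# ASSUMPTION: the category labels follow the order used in the paper
# (phage, plasmid, transposon, antibiotic resistance); the reply itself
# does not label the patterns.
CATEGORY_PATTERNS = {
    "phage": r"capsid|phage|tail|head|tape measure|antitermination",
    "plasmid": (
        r"resolv|relax|conjug|trb|plasmid|type IV|toxin|chromosome partitioning"
        r"|chromosome segregation|Resolv|Relax|Conjug|Trb|Plasmid|Type IV|Toxin"
        r"|Chromosome partitioning|Chromosome segregation"
    ),
    "transposon": (
        r"transpos|insertion|resolv|Tra[A-Z]|Tra[0-9]|IS[0-9]|conjugate transposon"
        r"|Transpos|Insertion|Resolv|Tra[A-Z]|Tra[0-9]|IS[0-9]|Conjugate transposon"
    ),
    "antibiotic_resistance": (
        r"multidrug|azole resistance|antibiotic resistance|TetR"
        r"|tetracycline resistance|VanZ|betalactam|beta-lactam|antimicrob|lantibio"
        r"|Multidrug|Azole resistance|Antibiotic resistance|TetR|tetr|tetR"
        r"|Tetracycline resistance|VanZ|vanz|vanZ|VANZ|Betalactam|Beta-lactam"
        r"|Antimicrob|Lantibio"
    ),
}

# Compile once; matching is case-sensitive, as in a plain awk /regex/ test.
COMPILED = {name: re.compile(pat) for name, pat in CATEGORY_PATTERNS.items()}


def assign_categories(annotation):
    """Return every category whose pattern occurs anywhere in the annotation text."""
    return [name for name, rx in COMPILED.items() if rx.search(annotation)]


if __name__ == "__main__":
    # Hypothetical annotation strings, for illustration only.
    for text in ["Phage capsid protein", "resolvase", "DNA polymerase III subunit alpha"]:
        print(text, "->", assign_categories(text))
```

Because the patterns match substrings, one annotation can fall into several categories (`resolv` appears in both the plasmid and transposon patterns), and short terms such as `tail`, `head` or `insertion` can match unrelated descriptions, so hits are worth checking against the database-specific annotations mentioned above.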
It helps a lot! Thank you very much.
Hello,
The valuable paper "Elevated rates of horizontal gene transfer in the industrialized human microbiome" mentions using text mining to assign genes to the phage, plasmid, transposon, and antibiotic resistance categories.
I wonder if you could make the gene assignment scripts publicly available? That would be very helpful.
Thank you very much.
Best, WANG Shuai