tseemann / abricate

:mag_right: :pill: Mass screening of contigs for antimicrobial and virulence genes
GNU General Public License v2.0

Extract sequence from output #139

Closed by buihoangphuc412 4 years ago

buihoangphuc412 commented 4 years ago
I would like to check for the dupA gene in H. pylori whole-genome contigs.
Here is the ABRicate output.
How can I extract the dupA sequence using its position on the contig?
Thank you for your help.
#FILE   SEQUENCE    START   END STRAND  GENE    COVERAGE    COVERAGE_MAP    GAPS    %COVERAGE   %IDENTITY   DATABASE    ACCESSION   PRODUCT RESISTANCE
VN0232.fasta    NODE_15_length_34180_cov_186.244934 22513   25011   +   VN0232  1-2501/2501 ========/====== 2/2 99.92   97.96   dupA        VN0232  
VN0246.fasta    NODE_16_length_34507_cov_288.285042 3902    6400    +   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.00   dupA        VN0232  
VN0268.fasta    NODE_3_length_162515_cov_137.020174 23709   26207   +   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.00   dupA        VN0232  
VN0355.fasta    NODE_16_length_36471_cov_77.382865  29821   32319   -   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.24   dupA        VN0232  
VN0410.fasta    NODE_2_length_165263_cov_153.866550 138786  141284  -   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.12   dupA        VN0232  
VN0434.fasta    NODE_1_length_141005_cov_237.163395 134784  137281  -   VN0232  1-2501/2501 ========/====== 3/3 99.88   97.84   dupA        VN0232  
VN0448.fasta    NODE_3_length_130847_cov_94.043756  102262  104759  -   VN0232  1-2501/2501 ========/====== 3/3 99.88   97.84   dupA        VN0232  
VN0472.fasta    NODE_4_length_107006_cov_67.939867  23713   26210   +   VN0232  1-2501/2501 ========/====== 3/3 99.88   98.12   dupA        VN0232  
VN0481.fasta    NODE_1_length_356149_cov_94.221132  129425  131923  -   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.48   dupA        VN0232  
VN0754.fasta    NODE_5_length_88056_cov_95.992794   4187    6217    +   VN0232  1-2032/2501 ========/====.. 1/1 81.21   98.28   dupA        VN0232  
VN1158.fasta    NODE_17_length_35828_cov_557.591899 4141    6640    +   VN0232  1-2501/2501 ========/====== 3/3 99.92   98.60   dupA        VN0232  
VN1165.fasta    NODE_1_length_439015_cov_111.504183 105724  108223  +   VN0232  1-2501/2501 ========/====== 3/3 99.92   98.12   dupA        VN0232  
VN1183.fasta    NODE_2_length_94171_cov_159.766053  67821   70319   -   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.12   dupA        VN0232  
VN1192.fasta    NODE_3_length_163987_cov_733.903905 23669   26168   +   VN0232  1-2501/2501 ========/====== 3/3 99.92   97.84   dupA        VN0232  
VN1196.fasta    NODE_4_length_141889_cov_634.958839 3754    6241    +   VN0232  13-2501/2501    ========/====== 3/3 99.44   98.31   dupA        VN0232  
VN1212.fasta    NODE_1_length_369174_cov_263.023446 146750  149246  -   VN0232  1-2501/2501 ========/====== 4/4 99.84   98.16   dupA        VN0232  
VN1221.fasta    NODE_10_length_54635_cov_1107.194595    28081   30579   -   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.32   dupA        VN0232  
VN1222.fasta    NODE_2_length_112518_cov_35.881716  92634   95132   +   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.68   dupA        VN0232  
VN1246.fasta    NODE_3_length_87591_cov_49.668853   3885    6383    +   VN0232  1-2501/2501 ========/====== 2/2 99.92   97.84   dupA        VN0232  
VN1251.fasta    NODE_8_length_55800_cov_186.267017  29804   32269   -   VN0232  1-2501/2501 ========/====== 3/35    98.60   96.56   dupA        VN0232  
VN1288.fasta    NODE_2_length_197328_cov_58.588494  88882   91380   +   VN0232  1-2501/2501 ========/====== 2/2 99.92   98.32   dupA        VN0232  
tseemann commented 4 years ago
# + strand: index the FASTA, then extract the hit region (1-based, inclusive)
samtools faidx VN0232.fasta
samtools faidx VN0232.fasta NODE_15_length_34180_cov_186.244934:22513-25011

# - strand: same, with -i to reverse-complement the extracted region
samtools faidx VN1251.fasta
samtools faidx -i VN1251.fasta NODE_8_length_55800_cov_186.267017:29804-32269
buihoangphuc412 commented 4 years ago

Thank you for the commands. I can now export a sequence to a FASTA file manually:
samtools faidx VN0232.fasta NODE_15_length_34180_cov_186.244934:22513-25011 > dupA.fasta
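To avoid typing one command per row, the same idea could be scripted. The sketch below is not part of ABRicate or samtools. It assumes the ABRicate report is tab-separated, with FILE, SEQUENCE, START, END and STRAND as the first five columns (as in the header above), and that file and contig names contain no spaces. The function name `abricate_to_faidx` and the file names `results.tab` and `dupA_all.fasta` are hypothetical. It only prints `samtools faidx` commands, adding `-i` for hits on the minus strand, so the commands can be reviewed before they are run.

```shell
# Hypothetical helper: read an ABRicate tab-separated report on stdin and
# print one `samtools faidx` command per hit. Nothing is executed here.
abricate_to_faidx() {
  awk -F'\t' '
    /^#/ { next }                       # skip the "#FILE ..." header line
    NF >= 5 {
      flag = ($5 == "-") ? "-i " : ""   # -i reverse-complements minus-strand hits
      printf "samtools faidx %s%s %s:%s-%s\n", flag, $1, $2, $3, $4
    }'
}

# Example usage (results.tab is an assumed file name):
#   abricate_to_faidx < results.tab                       # review the commands
#   abricate_to_faidx < results.tab | sh > dupA_all.fasta # run them
```

Printing the commands instead of running them directly keeps the step easy to check against the report, for example to confirm that partial hits such as the 81.21% coverage row are really wanted.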