I think we need to download all the gbenv* files from https://ftp.ncbi.nlm.nih.gov/genbank/gbenv1 (e.g. gbenv1.seq.gz through gbenv72.seq.gz), extract them, and parse the VERSION line of every record (e.g. VERSION AB000684.1) to get a list of accessions like:
AB000684.1
Then we can use that list to filter all the environmental samples out of the NT FASTA files the next time we run BWA.
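A rough sketch of the two steps, in case it helps. Everything here is an assumption rather than existing code: the function names are made up, the gbenv files are read straight from the .gz with Python's gzip module instead of being extracted first, each VERSION line is assumed to start in column 1 with the accession.version as its second whitespace-separated field, and the NT FASTA headers are assumed to begin with the accession (e.g. `>AB000684.1 description`). If our NT headers use a different ID layout, the header parsing would need adjusting.

```python
import gzip
from typing import Iterable, Iterator, Set, TextIO


def iter_version_accessions(lines: Iterable[str]) -> Iterator[str]:
    """Yield the accession.version token from each GenBank VERSION line."""
    for line in lines:
        # Assumption: VERSION lines start in column 1, e.g. "VERSION     AB000684.1"
        if line.startswith("VERSION"):
            fields = line.split()
            if len(fields) >= 2:
                yield fields[1]


def accessions_from_gz(path: str) -> Set[str]:
    """Collect all VERSION accessions from one gzipped GenBank flat file."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return set(iter_version_accessions(handle))


def filter_fasta(lines: Iterable[str], exclude: Set[str], out: TextIO) -> None:
    """Write every FASTA record whose ID is not in `exclude` to `out`."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            # Assumption: the first token after ">" is the accession.version
            fields = line[1:].split(None, 1)
            seq_id = fields[0] if fields else ""
            keep = seq_id not in exclude
        if keep:
            out.write(line)
```

To use it, we would union the sets returned by `accessions_from_gz` over all the downloaded gbenv*.seq.gz files, then stream each NT FASTA file through `filter_fasta` with that set as `exclude`. Keeping the accessions in a set makes each header lookup constant time, which matters given the size of NT.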