Open jiangyoufeng opened 1 year ago
Yes, you can replace blastx with diamond's blastx in stage_two. The results should not differ much if you use the same cutoffs.
Thanks for your reply!
Hi! I am interested in this replacement, but unfortunately I am not familiar with Python. I would appreciate it if you could give me the specific modification. Sincerely
Hi! Am I right to edit the code as below?
1) Only change the corresponding parameters in this area and leave everything else untouched.
2) Replace the stage_two.py in these directories.
3) Install diamond into the environment.
Sincerely
Yes, you only need to replace subprocess.run([...]) with your diamond arguments. Please note that the argument names may not be exactly identical, e.g. -mt_mode is not used by diamond.
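As a rough illustration of the renaming involved, the sketch below maps some common BLAST+ option names to their DIAMOND counterparts. The `BLAST_TO_DIAMOND` table and the `translate_args` helper are hypothetical and not part of ARG-OAP, and the exact options used in stage_two.py may differ, so check the result against the DIAMOND version you have installed.

```python
# Hypothetical helper, not part of ARG-OAP: translate BLAST+ style options
# into DIAMOND style options. An option mapped to None has no DIAMOND
# counterpart and is dropped together with its value.
BLAST_TO_DIAMOND = {
    '-query': '--query',
    '-db': '--db',
    '-out': '--out',
    '-evalue': '--evalue',
    '-max_target_seqs': '--max-target-seqs',
    '-num_threads': '--threads',
    '-mt_mode': None,  # BLAST+ only, not used by diamond
}


def translate_args(blast_args):
    """Translate a flat [option, value, option, value, ...] list.

    Options missing from the table raise KeyError so that nothing is
    passed through silently.
    """
    diamond_args = []
    for option, value in zip(blast_args[::2], blast_args[1::2]):
        mapped = BLAST_TO_DIAMOND[option]
        if mapped is not None:
            diamond_args += [mapped, value]
    return diamond_args
```

The output format option is left out of the table on purpose, because DIAMOND expects the format code and the column names as separate arguments, so it cannot be handled as a simple option/value pair.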
Hi, after some attempts, it didn't work. First, I changed subprocess.run([...]) to this:
def extract_seqs(self):
    '''
    Extract target sequences using more stringent cutoffs & blast.
    '''
    logger.info(f'Processing <{self.setting.extracted}> ...')
    nbps, nlines = simple_count(self.setting.extracted)
    blast_mode = 'blastx' if self.dbtype == 'prot' else 'blastn'

    logger.info('Extracting target sequences using BLAST ...')
    logger.info(f'BLAST settings: {nbps} bps, {nlines} reads, {self.thread} threads')
    subprocess.run([
        'diamond',
        blast_mode,
        '--db', self.db,
        '-q', self.setting.extracted,
        '-o', self.setting.blastout,
        '--outfmt', ' '.join(['6'] + self.setting.columns),
        '-e', str(self.e),
        '--max-target-seqs', '5',
        '-p', str(self.thread)])
But I get this error: "Error: Invalid output format: 6 qseqid sseqid pident length qlen slen evalue bitscore"
I could not figure it out, so I deleted the line '--outfmt', ' '.join(['6'] + self.setting.columns), to see whether it would run.
But then I got another error: "Opening the database... Error: This executable was not compiled with support for BLAST databases."
Try this one:
    subprocess.run([
        'diamond',
        blast_mode,
        '--db', self.db + '.dmnd',
        '-q', self.setting.extracted,
        '-o', self.setting.blastout,
        '--outfmt', '6'] + self.setting.columns + [
        '-e', str(self.e),
        '--max-target-seqs', '5',
        '-p', str(self.thread)])
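The two errors reported in this thread appear to have separate causes: DIAMOND expects the format code and each column name as separate command-line tokens rather than one space-joined string, and it reads its own `.dmnd` database rather than BLAST-formatted files. The following is a minimal sketch of both steps, assuming a protein FASTA file is available to build the database from. The helpers `build_makedb_cmd` and `build_blastx_cmd` and all file names are hypothetical and not part of ARG-OAP.

```python
# Hypothetical helpers, not part of ARG-OAP. They only build argument
# lists; pass the result to subprocess.run(cmd, check=True) to execute.

def build_makedb_cmd(fasta, db_prefix):
    """Command that builds <db_prefix>.dmnd from a protein FASTA file."""
    return ['diamond', 'makedb', '--in', fasta, '--db', db_prefix]


def build_blastx_cmd(db_prefix, query, out, columns, evalue, threads):
    """Command that runs diamond blastx with tabular (format 6) output."""
    return ([
        'diamond', 'blastx',
        '--db', db_prefix + '.dmnd',
        '--query', query,
        '--out', out,
        # The format code and every column name are separate list items.
        '--outfmt', '6'] + list(columns) + [
        '--evalue', str(evalue),
        '--max-target-seqs', '5',
        '--threads', str(threads)])


# Example with placeholder file names:
# import subprocess
# subprocess.run(build_makedb_cmd('proteins.fasta', 'proteins'), check=True)
# subprocess.run(build_blastx_cmd('proteins', 'extracted.fa', 'blastout.txt',
#                                 ['qseqid', 'sseqid', 'pident'], 1e-7, 8),
#                check=True)
```

Note that DIAMOND only aligns against protein databases (blastp and blastx), so the `'blastn'` branch of `blast_mode` would still need BLAST itself.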
Hi,
My data volume is large, so I want to speed up the alignment step. Is it reasonable to use diamond (blastx) instead of BLAST (blastx) in ARG-OAP? For example, would it be reasonable to change the code in the .py files so that the alignment is done by diamond?
Thanks!