liaoherui / StrainScan

High-resolution strain-level microbiome composition analysis tool based on reference genomes and k-mers
https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-023-01615-w
MIT License
32 stars 4 forks source link

system calls: not report on errors #3

Closed nick-youngblut closed 1 year ago

nick-youngblut commented 1 year ago

It appears that system calls (eg., calling dashing) are not done in a way that returns the stderr if the child job fails. For example:

def construct_matrix(input_fa): 
    '''
    pwd=os.getcwd()
    dir_work=os.path.split(os.path.abspath(__file__))[0]
    os.chdir(dir_work)
    '''
    path_file='genome_path_tem.txt'
    o=open(path_file,'w+')
    for filename in os.listdir(input_fa):
        o.write(input_fa+'/'+filename+'\n')
    o.close()
    #pwd=os.getcwd()    
    dir_dash=os.path.split(os.path.abspath(__file__))[0]+'/dashing_s128'
    #print(dir_dash+' dist -p10 -k31 -Odistance_matrix.txt -osize_estimates.txt -Q '+path_file+' -F '+path_file)
    #print(pwd)
    #exit()
    os.system(dir_dash+' dist  -p10 -k31 -Odistance_matrix.txt -osize_estimates.txt -Q  '+path_file+' -F '+path_file)
    #exit()
    nn=rebuild_matrix('distance_matrix.txt')
    os.system('rm genome_path_tem.txt size_estimates.txt')
    #os.system('rm genome_path_tem.txt size_estimates.txt')
    return nn

Switching os.system to Popen would be helpful. For instance:

    # cmd
    cmd = dir_dash+' dist  -p10 -k31 -Odistance_matrix.txt -osize_estimates.txt -Q  '+path_file+' -F '+path_file
    # run child job
    p = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE)
    output, err = p.communicate()
    ## check for errors
    if p.returncode != 0:
        raise ValueError(f'Error running {cmd}: {err.decode()}')
nick-youngblut commented 1 year ago

More generally, using python functions instead of os.system() for system level tasks, such as:

os.system('rm -rf '+kmer_sets_l2+'/Colinear_Block')

is generally more robust, since python-level errors are easier to detect and debug (e.g., you get a trackback).

So, the code above could be re-written as:

import os
import shutil

shutil.rmtree(os.path.join(kmer_sets_l2, 'Colinear_Block'))
liaoherui commented 1 year ago

Hi, Thanks a lot for your suggestions and contributions to StrainScan! The codes have been tested & debugged by me and updated based on the codes you provided. Now, StrainScan_build has a better code style. :)