medema-group / BiG-MAP

ERROR: could not open "mash_sketch.msh" for reading. #27

Closed mmpust closed 5 months ago

mmpust commented 5 months ago

Hi, I am running the command:

python3 BiG-MAP/src/BiG-MAP.family.py \
     -D gutsmash_out \
     -O BiGMAP_family \
     -p $CPUS \
     -b BiG-MAP/BiG-SCAPE \
     -pf BiG-MAP/BiG-SCAPE \
     --metatranscriptomes

___________Extracting fasta files__________
ERROR: could not open "BiGMAP_family/mash_sketch.msh" for reading.
_________Adding housekeeping genes_________
[]
________Preparing BiG-SCAPE input__________
__________Running BiG-SCAPE________________

Any ideas what the problem is with mash_sketch.msh? Thanks!

mmpust commented 5 months ago

Oh, so the log file says "/bin/sh: 1: mash: Argument list too long", although based on https://github.com/marbl/Mash/issues/179 this should not be a problem.
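
For context, a rough sketch of why this limit can be hit: when the command is run through the shell, the shell expands GC_PROT* into one argument per file before mash starts, and the kernel refuses to start the program once the combined size of the arguments exceeds its limit. The helper names estimate_arg_bytes and exceeds_arg_max are hypothetical, and the estimate is only a lower bound, because environment variables and the rest of the command line count towards the same limit.

```python
import os
from glob import glob


def estimate_arg_bytes(paths):
    """Rough size of the argument list: each path plus its NUL terminator."""
    return sum(len(os.fsencode(p)) + 1 for p in paths)


def exceeds_arg_max(paths):
    """True if the paths alone are larger than the system limit (ARG_MAX).

    This is a lower bound: the environment and the remaining arguments
    are counted by the kernel as well.
    """
    arg_max = os.sysconf("SC_ARG_MAX")  # available on POSIX systems only
    return estimate_arg_bytes(paths) > arg_max


if __name__ == "__main__":
    # 'BiGMAP_family' is the output directory from the command above
    files = glob(os.path.join("BiGMAP_family", "GC_PROT*"))
    print(len(files), "files,", estimate_arg_bytes(files), "bytes of arguments")
    print("limit (SC_ARG_MAX):", os.sysconf("SC_ARG_MAX"))
```

If the first number is close to or above the limit, passing the files as a list file instead of as command-line arguments is the usual way around it.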

mmpust commented 5 months ago

Okay, this can be solved by writing the argument list to a text file and then passing that text file to mash as input (mash sketch -l):

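import os
from glob import glob
from subprocess import Popen, PIPE

def make_sketch(outdir, kmer, sketch):
    """
    Builds the mash sketch file from the query fasta files (GC_PROT*)
    in outdir. The input paths are written to a list file and passed
    to mash with -l instead of being expanded on the command line.
    Parameters
    ----------
    outdir
        string, the path to output directory
    kmer
        k-mer size passed to mash sketch (-k)
    sketch
        sketch size passed to mash sketch (-s)
    returns
    ----------
    None
    """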
    outlogfile = os.path.join(outdir, 'log.file')
    with open(outlogfile, "wb") as log_file:
        try:
            outfile = os.path.join(outdir, 'mash_sketch')

            inp_files = glob(os.path.join(outdir, 'GC_PROT*'))
            inp_txt = os.path.join(outdir, 'input_files.txt')
            with open(inp_txt, 'w') as f:
                for file in inp_files:
                    f.write(file + '\n')
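            # -l: the input file lists the sequence paths, one per line.
            # -a (protein alphabet) is carried over from the previous command,
            # assuming the GC_PROT* files hold protein sequences.
            cmd_mash = f"mash sketch -o {outfile} -k {kmer} -p 1 -s {sketch} -a -l {inp_txt}"
            # Previous version: the shell expanded GC_PROT* into one argument per file
            #inp = os.path.join(outdir, 'GC_PROT*')
            #cmd_mash = f"mash sketch -o {outfile} -k {kmer} -p 1 -s {sketch} -a {inp}"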
            p = Popen(cmd_mash, shell=True, stdout=PIPE, stderr=PIPE)
            stdout, stderr = p.communicate()
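            log_file.write(stderr)
            if p.returncode != 0:
                # Raise error here for error table
                pass

        except OSError:
            # Popen raises OSError if the shell cannot be started;
            # it never raises subprocess.CalledProcessError
            pass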
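
A variant of the same idea, as a sketch rather than the BiG-MAP implementation: passing the command as a list to subprocess.run avoids the shell entirely, and check=True makes a failing mash run raise subprocess.CalledProcessError, which Popen on its own never does. The function names write_input_list, build_mash_cmd and run_mash_sketch are hypothetical. The mash flags are the ones used above, plus -a from the commented-out command, on the assumption that the GC_PROT* files hold protein sequences.

```python
import os
import subprocess
from glob import glob


def write_input_list(outdir, pattern="GC_PROT*"):
    """Write the matching fasta paths to a list file, one per line."""
    inp_txt = os.path.join(outdir, "input_files.txt")
    with open(inp_txt, "w") as f:
        for path in sorted(glob(os.path.join(outdir, pattern))):
            f.write(path + "\n")
    return inp_txt


def build_mash_cmd(outdir, kmer, sketch, inp_txt):
    """Build the mash command as a list, so no shell is involved."""
    outfile = os.path.join(outdir, "mash_sketch")
    return ["mash", "sketch", "-o", outfile, "-k", str(kmer), "-p", "1",
            "-s", str(sketch), "-a", "-l", inp_txt]


def run_mash_sketch(outdir, kmer, sketch):
    """Run mash sketch; stderr goes to log.file, a failure raises."""
    cmd = build_mash_cmd(outdir, kmer, sketch, write_input_list(outdir))
    with open(os.path.join(outdir, "log.file"), "wb") as log_file:
        # check=True raises subprocess.CalledProcessError on a non-zero exit
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=log_file, check=True)
```

With this layout the caller can catch subprocess.CalledProcessError and stop the pipeline, instead of continuing to the housekeeping-gene and BiG-SCAPE steps without a sketch file.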