NBISweden / IgDiscover-legacy

Analyze antibody repertoires and discover new V genes from high-throughput sequencing reads
https://www.igdiscover.se
MIT License

Parse Flash stats fails #102

Closed: souptonuts closed this issue 10 months ago

souptonuts commented 5 years ago

When I try to run a data analysis I get this error:

Error in rule parse_flash_stats:
    jobid: 0
    output: stats/reads.json

SystemExit in line 130 of /usr/local/lib/python3.6/dist-packages/igdiscover-0.12.dev68+g7f6179b-py3.6.egg/igdiscover/Snakefile:
Could not parse the FLASH log file
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
  File "/usr/local/lib/python3.6/dist-packages/igdiscover-0.12.dev68+g7f6179b-py3.6.egg/igdiscover/Snakefile", line 130, in __rule_parse_flash_stats
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2019-08-16T114307.922822.snakemake.log

I am not familiar with Snakemake workflows and rules. As a workaround, I copied the parse_flash_stats rule section of the Snakefile into a standalone Python script, renaming variables so that it runs outside Snakemake. IgDiscover runs to completion after I execute this script.

#!/usr/bin/env python3
# Standalone copy of the parse_flash_stats rule: reads the FLASH log and
# writes the read-merging statistics to stats/reads.json.

import re
import sys  # needed for sys.exit() below
import json
from collections import OrderedDict

# The imports below are not used by this script
import shutil
import textwrap
import dnaio
import igdiscover
from igdiscover.dna import reverse_complement
from igdiscover.utils import relative_symlink
from igdiscover.config import Config, GlobalConfig

input_log = 'reads/2-flash.log'
output_json = 'stats/reads.json'

total_ex = re.compile(r'\[FLASH\]\s*Total pairs:\s*([0-9]+)')
merged_ex = re.compile(r'\[FLASH\]\s*Combined pairs:\s*([0-9]+)')
with open(input_log) as f:
    for line in f:
        # The "Total pairs" line is expected before "Combined pairs"
        match = total_ex.search(line)
        if match:
            total = int(match.group(1))
            continue
        match = merged_ex.search(line)
        if match:
            merged = int(match.group(1))
            break
    else:
        # Reached only if no "Combined pairs" line was found
        sys.exit('Could not parse the FLASH log file')
d = OrderedDict({'total': total})
d['merged'] = merged
d['merging_was_done'] = True
with open(output_json, 'w') as f:
    json.dump(d, f)
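
For anyone who wants to check the parsing step in isolation, the same logic can be sketched as a small function that takes the log text and returns the statistics. This is only an illustration, not the code from the IgDiscover Snakefile: the function name parse_flash_log and the sample log excerpt are made up here, and only the two regular expressions and the output keys come from the script above. Unlike the loop above, it searches the whole text for each pattern, so it does not depend on the order of the two lines.

```python
import json
import re
from collections import OrderedDict

# Same patterns as in the workaround script above
TOTAL_RE = re.compile(r'\[FLASH\]\s*Total pairs:\s*([0-9]+)')
MERGED_RE = re.compile(r'\[FLASH\]\s*Combined pairs:\s*([0-9]+)')


def parse_flash_log(text):
    """Return merging statistics parsed from FLASH log text.

    Hypothetical helper for illustration. Raises ValueError if either
    the total or the combined pair count cannot be found.
    """
    total = TOTAL_RE.search(text)
    merged = MERGED_RE.search(text)
    if total is None or merged is None:
        raise ValueError('Could not parse the FLASH log file')
    stats = OrderedDict()
    stats['total'] = int(total.group(1))
    stats['merged'] = int(merged.group(1))
    stats['merging_was_done'] = True
    return stats


# Made-up log excerpt; a real FLASH log contains many more lines
SAMPLE_LOG = """\
[FLASH] Read combination statistics:
[FLASH]     Total pairs:      1000
[FLASH]     Combined pairs:   800
"""

if __name__ == '__main__':
    print(json.dumps(parse_flash_log(SAMPLE_LOG)))
```

Raising an exception instead of calling sys.exit() keeps the function usable from other code; a calling script can still catch the ValueError and exit with the same message.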

marcelm commented 5 years ago

This is the same as #101, right? Did you see that I commented on that issue? The problem is already fixed; you just need to run git pull to get the fix (commit ea312ebcc5c0f2522a66).

souptonuts commented 5 years ago

Yes, this is a duplicate. I had edited the Snakefile in my copy without doing a git pull, and it didn’t work.

The new git version works correctly.

Thanks for your help.

marcelm commented 10 months ago

This repository is outdated and is going to be archived. Please see the new repository at https://gitlab.com/gkhlab/igdiscover22/ or the homepage at https://www.igdiscover.se/ for the most recent and maintained IgDiscover version.