bmvdgeijn / WASP

WASP: allele-specific pipeline for unbiased read mapping and molecular QTL discovery
Apache License 2.0
103 stars 51 forks source link

Run the example extract_haplotype_read_counts, error IOError #73

Open yjzhang1020 opened 6 years ago

yjzhang1020 commented 6 years ago

Hi Graham,

I tried to run the example extract_haplotype_read_stock.py, and I got an error message like this.

IOError: ``/H3K27ac/read_counts.18505.h5`` does not exist

I'm sure the file exists, and I'm using the file in the example you gave. I tried to solve the problem for a long time, but still got nothing.Can you give me some guidance? Thank you .The following is the detailed error message.


$for INDIVIDUAL in $(cat $H3K27AC_SAMPLES_FILE)
> do
>     echo $INDIVIDUAL
>     python $WASP/CHT/extract_haplotype_read_counts.py \
>        --chrom $DATA_DIR/chromInfo.hg19.txt \
>        --snp_index $DATA_DIR/snp_index.h5 \
>        --snp_tab $DATA_DIR/snp_tab.h5 \
>        --geno_prob $DATA_DIR/geno_probs.h5 \
>        --haplotype $DATA_DIR/haps.h5 \
>        --samples $ALL_SAMPLES_FILE \
>        --individual $INDIVIDUAL \
>        --ref_as_counts $DATA_DIR/H3K27ac/ref_as_counts.$INDIVIDUAL.h5 \
>        --alt_as_counts $DATA_DIR/H3K27ac/alt_as_counts.$INDIVIDUAL.h5 \
>        --other_as_counts $DATA_DIR/H3K27ac/other_as_counts.$INDIVIDUAL.h5 \
>        --read_counts $DATA_DIT/H3K27ac/read_counts.$INDIVIDUAL.h5 \
>        $DATA_DIR/H3K27ac/chr22.peaks.txt.gz \
>        | gzip > $DATA_DIR/H3/haplotype_read_counts.$INDIVIDUAL.txt.gz
> done
18505
reading list of individuals from /home/zhang/WASP-0.3.0/examples/example_data/genotypes/YRI_samples.txt
Traceback (most recent call last):
  File "/home/zhang/WASP-0.3.0/CHT/extract_haplotype_read_counts.py", line 618, in <module>
    main()
  File "/home/zhang/WASP-0.3.0/CHT/extract_haplotype_read_counts.py", line 526, in main
    data_files = DataFiles(args)
  File "/home/zhang/WASP-0.3.0/CHT/extract_haplotype_read_counts.py", line 54, in __init__
    self.read_count_h5 = tables.open_file(args.read_counts, "r")
  File "/home/zhang/miniconda2/envs/p37/lib/python3.6/site-packages/tables/file.py", line 320, in open_file
    return File(filename, mode, title, root_uep, filters, **kwargs)
  File "/home/zhang/miniconda2/envs/p37/lib/python3.6/site-packages/tables/file.py", line 784, in __init__
    self._g_new(filename, mode, **params)
  File "tables/hdf5extension.pyx", line 371, in tables.hdf5extension.File._g_new
  File "/home/zhang/miniconda2/envs/p37/lib/python3.6/site-packages/tables/utils.py", line 157, in check_file_access
    raise IOError("``%s`` does not exist" % (filename,))
OSError: ``/H3K27ac/read_counts.18505.h5`` does not exist
Closing remaining open files:/home/zhang/WASP-0.3.0/examples/example_data/H3K27ac/alt_as_counts.18505.h5...done/home/zhang/WASP-0.3.0/examples/example_data/H3K27ac/other_as_counts.18505.h5...done/home/zhang/WASP-0.3.0/examples/example_data/H3K27ac/ref_as_counts.18505.h5...done
18507
reading list of individuals from /home/zhangyongjie/WASP-0.3.0/examples/example_data/genotypes/YRI_samples.txt
.....
....
yjzhang1020 commented 6 years ago

Hi Graham, I found the problem with the help of my senior brother. A variable in your example_cht_workflow. Sh was miswritten.I copied the code you wrote, causing the problem above.

for INDIVIDUAL in $(cat $H3K27AC_SAMPLES_FILE)

    #
    # create CHT input file for this individual
    #
    python $WASP/CHT/extract_haplotype_read_counts.py \
       --chrom $DATA_DIR/chromInfo.hg19.txt \
       --snp_index $DATA_DIR/snp_index.h5 \
       --snp_tab $DATA_DIR/snp_tab.h5 \
       --geno_prob $DATA_DIR/geno_probs.h5 \
       --haplotype $DATA_DIR/haps.h5 \
       --samples $ALL_SAMPLES_FILE \
       --individual $INDIVIDUAL \
       --ref_as_counts $DATA_DIR/H3K27ac/ref_as_counts.$INDIVIDUAL.h5 \
       --alt_as_counts $DATA_DIR/H3K27ac/alt_as_counts.$INDIVIDUAL.h5 \
       --other_as_counts $DATA_DIR/H3K27ac/other_as_counts.$INDIVIDUAL.h5 \
       --read_counts $DATA_DIT/H3K27ac/read_counts.$INDIVIDUAL.h5 \
       $DATA_DIR/H3K27ac/chr22.peaks.txt.gz \
       | gzip > $DATA_DIR/H3K27ac/haplotype_read_counts.$INDIVIDUAL.txt.gz

done
--read_counts $DATA_DIT/H3K27ac/read_counts.$INDIVIDUAL.h5 \
#it should be
--read_counts $DATA_DIR/H3K27ac/read_counts.$INDIVIDUAL.h5 \

And there is also a slight error in your looping grammar. I should put a "do" before line 96