XiaoTaoWang / EagleC

A deep-learning framework for predicting a full range of structural variations from bulk and single-cell contact maps

OSError file signature not found #12

Open ValentinaBoP opened 2 years ago

ValentinaBoP commented 2 years ago

Hi,

I'm trying to run EagleC for a non-model organism but I'm getting this error:

predictSV --hic-5k $PAIRS5 --hic-10k $PAIRS10 --hic-50k $PAIRS50 -O uraCya_self -g other --balance-type ICE --output-format full --prob-cutoff-5k 0.8 --prob-cutoff-10k 0.8 --prob-cutoff-50k 0.99999 --logFile uraCya_self_eaglec.log

root                      INFO    @ 09/21/22 15:45:31: 
# ARGUMENT LIST:
# Cool URI at 5kb = uraCya_HiC_matrix_5000_balanced.cool
# Cool URI at 10kb = uraCya_HiC_matrix_10000_balanced.cool
# Cool URI at 50kb = uraCya_HiC_matrix_50000_balanced.cool
# Balance Type = ICE
# Reference Genome = other
# Included Chromosomes = ['#', 'X']
# Probability Cutoff for 5kb SVs = 0.8
# Probability Cutoff for 10kb SVs = 0.8
# Probability Cutoff for 50kb SVs = 0.99999
# Output File Prefix = uraCya_self
# Output Format = full
# Log file name = uraCya_self_eaglec.log
root                      INFO    @ 09/21/22 15:45:31: Predict SVs at 5kb resolution ...
Traceback (most recent call last):
  File "/home/vpeona/.conda/envs/EagleC/bin/predictSV-single-resolution", line 276, in <module>
    run()
  File "/home/vpeona/.conda/envs/EagleC/bin/predictSV-single-resolution", line 116, in run
    clr = cooler.Cooler(args.hic)
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/site-packages/cooler/api.py", line 80, in __init__
    self._refresh()
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/site-packages/cooler/api.py", line 84, in _refresh
    with open_hdf5(self.store, **self.open_kws) as h5:
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/site-packages/cooler/util.py", line 576, in open_hdf5
    fh = h5py.File(fp, mode, *args, **kwargs)
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/site-packages/h5py/_hl/files.py", line 406, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
Traceback (most recent call last):
  File "/home/vpeona/.conda/envs/EagleC/bin/predictSV", line 176, in <module>
    run()
  File "/home/vpeona/.conda/envs/EagleC/bin/predictSV", line 112, in run
    subprocess.check_call(' '.join(command), shell=True)
  File "/home/vpeona/.conda/envs/EagleC/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'predictSV-single-resolution -H uraCya_HiC_matrix_5000_balanced.cool --balance-type ICE -O uraCya_self.CNN_SVs.5K.txt --genome other --output-format full -C "#" "X" --prob-cutoff 0.8 --logFile uraCya_self_eaglec.log' returned non-zero exit status 1.

Can you help me to understand how to fix it?

Thank you!
Valentina

XiaoTaoWang commented 2 years ago

Hi Valentina,

The error indicates that your input files are not valid .cool files. Can you double-check where the files came from and make sure they were generated by the cooler package (https://github.com/open2c/cooler) or by a tool that uses cooler as a dependency?
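As a quick way to test this outside of EagleC, the sketch below checks whether a file starts with the 8-byte HDF5 signature, which is what h5py looks for when it reports "file signature not found". The helper name `looks_like_hdf5` is hypothetical, and the check assumes the file has no HDF5 user block (in that case the signature sits at a later offset), which is the usual situation for .cool files.

```python
import sys

# Every HDF5 file (and therefore every single-resolution .cool file)
# begins with this fixed 8-byte signature when no user block is present.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` begins with the HDF5 signature.

    Hypothetical helper for illustration; it only inspects the first
    8 bytes and does not validate the rest of the file.
    """
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


if __name__ == "__main__":
    # Usage: python check_cool.py file1.cool file2.cool ...
    for name in sys.argv[1:]:
        status = "HDF5 signature found" if looks_like_hdf5(name) else "NOT an HDF5 file"
        print(f"{name}: {status}")
```

A file that fails this check cannot be opened by `cooler.Cooler`, whatever its extension is.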

Xiaotao

ValentinaBoP commented 2 years ago

I produced the .cool files myself as follows:

# map reads
bwa mem -SP5M -t 16 $REF $R1 $R2 | samtools view -bhS - > $BAM

# filter reads
samtools view -h $BAM | pairtools parse -c $REF.fai -o $PAIREDSAM
pairtools sort --nproc 16 -o $SORTEDSAM $PAIREDSAM
pairtools dedup --mark-dups -o $DEDUPSAM $SORTEDSAM
pairtools select '(pair_type == "UU") or (pair_type == "UR") or (pair_type == "RU")' -o $FILTEREDSAM $DEDUPSAM
pairtools split --output-pairs $PAIRS $FILTEREDSAM
pairix -p pairs $PAIRS

for SIZE in 5000 10000 50000
do
 cooler cload pairix $REF.fai:$SIZE $PAIRS ${PREFIX}_${SIZE}.cool
 cooler balance --stdout ${PREFIX}_${SIZE}.cool > ${PREFIX}_${SIZE}_balanced.cool
done
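
A possible explanation, offered as an assumption rather than a diagnosis confirmed in this thread: the `--stdout` option of `cooler balance` prints the weight column as text instead of storing it, so the redirected `*_balanced.cool` files would contain plain text rather than HDF5 data. Without `--stdout`, `cooler balance` writes a `weight` column into the input .cool file itself. The sketch below builds and runs that in-place variant for the three resolutions; the function names are hypothetical, and the file naming simply mirrors the loop above.

```python
import subprocess


def balance_commands(prefix, sizes=(5000, 10000, 50000)):
    """Build one `cooler balance` command per resolution.

    No --stdout flag is passed, so cooler stores the weights inside
    each `<prefix>_<size>.cool` file instead of printing them.
    """
    return [["cooler", "balance", f"{prefix}_{size}.cool"] for size in sizes]


def run_balancing(prefix, sizes=(5000, 10000, 50000)):
    """Run the balancing commands in order; raises if any of them fails."""
    for command in balance_commands(prefix, sizes):
        subprocess.check_call(command)
```

Under this assumption, `predictSV` would then be pointed at the in-place balanced `${PREFIX}_${SIZE}.cool` files rather than at the redirected `*_balanced.cool` outputs.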