EGA-archive / ont2cram

Oxford Nanopore HDF/Fast5 to CRAM conversion tool
Apache License 2.0

Does this work for modern multi-read FAST5 only? #11

Open colindaven opened 4 years ago

colindaven commented 4 years ago

Hi,

This script is nice. It seemed to work for a modern dataset sequenced in mid-2019 (with 4000 reads per fast5), but not for data sequenced on ONT in 2017 (1 read per fast5).

Is this to be expected?

Thanks, Colin

AlexanderVi commented 4 years ago

Actually the tool does support single-read fast5. What kind of error do you get?

colindaven commented 4 years ago

Sorry about the late reply; Christmas is a busy time here. Here is the command and its output:


# Command:
/run1/pass$ bash run_ont2cram_SLURM.sh 0/

# (contains)
srun -c 18 /mnt/ngsnfs/tools/ont2cram/ont2cram -i $fast5_dir -o $fast5_dir.cram

#  Output

Input directory:  0/
100%|██████████| 4000/4000 [02:35<00:00, 25.70it/s]
Loading Fast5 from: '/working2/tmp/tmp_tar_gz/run1/pass/0'
Writing CRAM to: '/working2/tmp/tmp_tar_gz/run1/pass/0/.cram'
Traceback (most recent call last):
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram", line 5, in <module>
    sys.exit(ont2cram.main())
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 380, in main
    run(args.inputdir, args.fastqdir, args.outputfile, args.skipsignal)
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 367, in run
    write_cram( fast5_files, output_file, skip_signal, fastq_map )
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 237, in write_cram
    with pysam.AlignmentFile( cram_file, "wc", header=header, format_options=[b"no_ref=1"] ) as outf:
  File "pysam/libcalignmentfile.pyx", line 401, in pysam.libcalignmentfile.AlignmentFile.__cinit__ (pysam/libcalignmentfile.c:5835)
  File "pysam/libcalignmentfile.pyx", line 435, in pysam.libcalignmentfile.AlignmentFile._open (pysam/libcalignmentfile.c:6425)
TypeError: _open() got an unexpected keyword argument 'format_options'
srun: error: hpc-rc07: task 0: Exited with exit code 1
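
A side note on the output path in the log above: the run was started with `0/`, and the wrapper builds the output name as `$fast5_dir.cram`, so the CRAM target became `.../0/.cram` inside the input directory. The sketch below shows one way to derive the name so that a trailing slash does not matter. `cram_path_for` is a hypothetical helper for illustration only; it is not part of ont2cram or of the wrapper script.

```python
import os


def cram_path_for(fast5_dir):
    """Build '<dir>.cram' next to the input directory (hypothetical helper).

    os.path.normpath drops a trailing separator, so '0/' and '0'
    both map to '0.cram' rather than '0/.cram'.
    """
    return os.path.normpath(fast5_dir) + ".cram"


if __name__ == "__main__":
    for candidate in ("0/", "0"):
        print(candidate, "->", cram_path_for(candidate))
```

The result could then be passed to `-o` in place of `$fast5_dir.cram`.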

AlexanderVi commented 4 years ago

What are your Python version and pysam module version?
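
A minimal sketch for collecting both in one go, run with the same interpreter that launches ont2cram. The function name `describe_environment` is made up for this example. It also prints the interpreter path, which may help if more than one Python installation is on the machine.

```python
import platform
import sys


def describe_environment():
    """Report the running interpreter and the pysam version it can import."""
    try:
        import pysam
        # Fall back to "unknown" if the module exposes no version string.
        pysam_version = getattr(pysam, "__version__", "unknown")
    except ImportError:
        pysam_version = None  # pysam is not importable from this interpreter
    return {
        "python": platform.python_version(),
        "executable": sys.executable,
        "pysam": pysam_version,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```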

colindaven commented 4 years ago

Good point, I was probably having a conda collision before. Note the different error now - apologies.

Ubuntu 16.04.

Should be using

Python 3.5.2 (default, Oct 8 2019, 13:06:37)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.

and

pip freeze | grep pysam

pysam==0.15.2

bash run_ont2cram_SLURM.sh 0
Input directory:  0
Loading Fast5 from: '/working2/tmp/tmp_tar_gz/run1/pass/0'
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [03:50<00:00, 17.37it/s]
Writing CRAM to: '/working2/tmp/tmp_tar_gz/run1/pass/0.cram'
  0%|                                                                                                                                  | 0/4000 [00:00<?, ?it/s]
/home/rcug/.local/lib/python3.5/site-packages/h5py/_hl/dataset.py:313: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  "Use dataset[()] instead.", H5pyDeprecationWarning)
  1%|▋                                                                                                                        | 21/4000 [00:02<08:41,  7.64it/s]
Traceback (most recent call last):
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram", line 5, in <module>
    sys.exit(ont2cram.main())
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 380, in main
    run(args.inputdir, args.fastqdir, args.outputfile, args.skipsignal)
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 367, in run
    write_cram( fast5_files, output_file, skip_signal, fastq_map )
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 317, in write_cram
    read_group.visititems( partial(process_attrs,a_s) )
  File "/home/rcug/.local/lib/python3.5/site-packages/h5py/_hl/group.py", line 563, in visititems
    return h5o.visit(self.id, proxy)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 355, in h5py.h5o.visit
  File "h5py/defs.pyx", line 1594, in h5py.defs.H5Ovisit_by_name
  File "h5py/h5o.pyx", line 302, in h5py.h5o.cb_obj_simple
  File "/home/rcug/.local/lib/python3.5/site-packages/h5py/_hl/group.py", line 562, in proxy
    return func(name, self[name])
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 287, in process_attrs
    process_dataset( cram_seg, name, group_or_dset, columns )
  File "/mnt/ngsnfs/tools/ont2cram/ont2cram.py", line 267, in process_dataset
    if type(col_values[0]) is bytes:
IndexError: index out of range
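
The last frame fails on `col_values[0]`, which would be consistent with `col_values` being empty for one of the datasets in this file, though the traceback alone does not confirm that. As a sketch under that assumption, the check could be guarded as below. `column_holds_bytes` is a hypothetical name and is not taken from ont2cram.py; only the `type(col_values[0]) is bytes` test comes from the traceback.

```python
def column_holds_bytes(col_values):
    """Return True if the column is non-empty and its first element is bytes.

    An empty column returns False instead of raising IndexError.
    """
    if len(col_values) == 0:
        return False
    # Same test as in the traceback, now only reached for non-empty columns.
    return type(col_values[0]) is bytes


if __name__ == "__main__":
    print(column_holds_bytes([]))
    print(column_holds_bytes([b"ACGT"]))
    print(column_holds_bytes(["ACGT"]))
```

Whether an empty column should be skipped or written as an empty tag is a separate decision for the tool itself.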