nanoporetech / duplex-tools

Splitting of sequence reads by internal adapter sequence search

guppy_duplex ValueError: not enough values to unpack (expected 4, got 3) #51

Open dkastl1 opened 1 year ago

dkastl1 commented 1 year ago

When I run guppy_duplex on my fast5 files, simplex basecalling completes successfully, but the run then fails with ValueError: not enough values to unpack (expected 4, got 3) when it attempts to generate the duplex pairs file. Here is the full output up to the error:

INFO:guppy_duplex:Using guppy_basecaller at guppy_basecaller
INFO:guppy_duplex:Using guppy_basecaller_duplex at guppy_basecaller_duplex
INFO:guppy_duplex:728396 reads loaded from simplex summary file
INFO:guppy_duplex:Candidate pair generation took 3.07 seconds, 32558 pairs found
Traceback (most recent call last):
  File "/home/minion/.conda/envs/nanop.v1/bin/guppy_duplex", line 8, in <module>
    sys.exit(main())
  File "/home/minion/.conda/envs/nanop.v1/lib/python3.9/site-packages/ont_guppy_duplex_pipeline/guppy_duplex.py", line 279, in main
    duplex_pipeline(args.basecaller_exe, args.duplex_basecaller_exe, args.input_path, args.save_path,
  File "/home/minion/.conda/envs/nanop.v1/lib/python3.9/site-packages/ont_guppy_duplex_pipeline/guppy_duplex.py", line 203, in duplex_pipeline
    simplex_summary = _build_pairs_file(save_path, split_reads)
  File "/home/minion/.conda/envs/nanop.v1/lib/python3.9/site-packages/ont_guppy_duplex_pipeline/guppy_duplex.py", line 175, in _build_pairs_file
    filter_candidates_to_file(neighbours, sequence_file_path, logger, split_reads=split_reads)
  File "/home/minion/.conda/envs/nanop.v1/lib/python3.9/site-packages/ont_guppy_duplex_pipeline/candidate_filtering.py", line 80, in filter_candidates_to_file
    for name, sequence, _, _ in fastx:
ValueError: not enough values to unpack (expected 4, got 3)

Thanks, Domenique

ollenordesjo commented 1 year ago

Hi Domenique,

This looks like it's coming from pyfastx, and ont-guppy-duplex-pipeline might need to pin a specific version of it. Could you let me know which version of pyfastx you have installed? It would also help to know the versions of the other packages installed alongside ont-guppy-duplex-pipeline, if you could share them.

I can feed it back to the guppy team and see if they can get a fix (probably pyfastx pinning) for it.

Cheers!

dkastl1 commented 1 year ago

Here are the versions installed alongside ont-guppy-duplex-pipeline:

Thanks! Domenique

GeoMicroSoares commented 1 year ago

@ollenordesjo any news on this? I'm getting the same error. Thanks!

Rasinj commented 1 year ago

Thanks @GeoMicroSoares, I'm checking internally whether pyfastx can be pinned to a previous version or whether the code in ont_guppy_duplex_pipeline can be updated. In the meantime, can you try manually installing pyfastx 0.8.4 to see if that fixes the problem? According to the readme for that release, pyfastx was returning four values per record at the time: https://github.com/lmdu/pyfastx/tree/0.8.4.

I'll let you know when I hear back
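For anyone wondering what a code-side fix could look like, here is a minimal sketch. It is not the actual ont_guppy_duplex_pipeline code: the helper name is hypothetical, and the sample tuples only stand in for the records that iterating a pyfastx reader would yield. It assumes, as the traceback suggests, that the installed pyfastx yields three fields per record where the pipeline expects four. Unpacking only the first two fields and ignoring the rest works for either record length.

```python
def name_and_sequence(record):
    """Return (name, sequence) from a record tuple with at least two fields.

    Hypothetical helper: the starred target absorbs any trailing fields
    (quality, comment), so three- and four-field records both unpack cleanly.
    """
    name, sequence, *_ = record
    return name, sequence


# Stand-in records (assumption: the real ones come from the pyfastx iterator).
four_field = ("read1", "ACGT", "IIII", "some comment")
three_field = ("read1", "ACGT", "IIII")

for record in (four_field, three_field):
    name, sequence = name_and_sequence(record)
    print(name, sequence)
```

The loop at line 80 of candidate_filtering.py only uses the name and sequence, so a change along these lines would not depend on how many trailing fields the reader returns.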

onordesjo commented 1 year ago

Hi @GeoMicroSoares, @dkastl1, it looks like the next version will include either a pyfastx version pin or a code change, which should fix this. In the meantime, anyone hitting the same error can avoid it by installing pyfastx v0.8.4 manually.
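To confirm which pyfastx version an environment actually has, a small check like the following may help. This is only a sketch: the function name and the EXPECTED constant are illustrative, and it relies solely on the standard library's importlib.metadata (Python 3.8+). If the versions differ, `pip install pyfastx==0.8.4` in the same environment is the usual way to pin it.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "0.8.4"  # the version suggested in this thread


def installed_version(package):
    """Return the installed version string for a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("pyfastx")
if found is None:
    print("pyfastx is not installed in this environment")
elif found == EXPECTED:
    print(f"pyfastx {found} matches the suggested version")
else:
    print(f"pyfastx {found} is installed; this thread suggests {EXPECTED}")
```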

It's worth keeping in mind that it will be sensible to transition to dorado whenever possible since it will be the main basecaller going forward. Can you let me know if it's possible to convert your fast5s to pod5s and basecall them with dorado duplex?

Thanks in advance!