hyeshik / poreplex

A versatile sequenced read processor for nanopore direct RNA sequencing

File could not be opened due to unknown error #3

Closed ilisem closed 5 years ago

ilisem commented 6 years ago

Hi Hyeshik,

I was trying to execute the command as below, however there was an unknown error.

moltox@moltox-desktop[moltox] poreplex -i /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5 -o /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved  --trim-adapter --keep-unsplit --fastq 

Poreplex version 0.1 by Hyeshik Chang <hyeshik@snu.ac.kr>
- Cuts nanopore direct RNA sequencing data into bite-size pieces for RNA Biology

Output directory /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved is not empty. Clear it? (y/N) y

== Analysis settings ======================================
 * Input:    /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5
 * Output:    /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved
 * Processes:    1
 * Presets:    rna-r941.cfg
 * Basecall on-the-fly:        No (use previous analyses)
 * Trim 3' adapter:        Yes
 * Filter concatenated read:    No
 * Separate by barcode:        No
 * Real-time alignment:        No
 * FASTQ in output:        Yes
 * FAST5 in output:        No    
 * Basecall table in output:    No
===========================================================

==> Processing FAST5 files
| 100% of 134804 |##############################| Elapsed: 0:07:44 Time: 0:07:44

==> Finished.
== Result Summary ==
 * Successfully processed:    0
 * Possibly artifact:        0
 * Processing failed:        0
 * Failed to open:        134804
    - File could not be opened due to unknown error:    134804

Then I noticed that I was still using version 0.1, so I figured this might explain the error. I tried to update the package with the following command, but the update was unsuccessful:

pip3 install git+https://github.com/hyeshik/poreplex.git

I wonder how this error could be fixed. Thank you.

yongkuk commented 6 years ago

Please try pip3 install --upgrade poreplex

hyeshik commented 6 years ago

Hi ilisem,

Thank you for the detailed report! The new update 81d7bc3 now prints the exact exception messages of the “unknown” errors. Can you please post the error message written to {OUTPUTDIR}/poreplex.log here?

If the banner still shows the old version number after the update, your best bet is to run pip uninstall repeatedly until no copies of poreplex remain, then reinstall it from scratch with pip's --no-cache-dir option.
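To confirm which version pip actually left installed before rerunning, a small check along these lines may help. It is only a sketch, assuming Python 3.8 or later (for importlib.metadata) and that the package is registered under the distribution name "poreplex"; the helper name installed_version is hypothetical and not part of poreplex.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "poreplex" is assumed to be the distribution name pip installed.
    print("poreplex:", installed_version("poreplex") or "not installed")
```

If this reports a version different from the one in the poreplex banner, more than one Python environment is probably involved, so it is worth checking which pip3 and which poreplex executable are first on the PATH.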

ilisem commented 6 years ago

Hi Hyeshik,

Thanks for the quick reply. I managed to successfully update poreplex to the latest version. However, when I tried to run the command again, I got the same error message:

moltox@moltox-desktop[moltox] poreplex -i /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5 -o /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved  --trim-adapter --keep-unsplit --fastq 

Poreplex version 0.2.1a1 by Hyeshik Chang <hyeshik@snu.ac.kr>
- Cuts nanopore direct RNA sequencing data into bite-size pieces for RNA Biology

Output directory /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved is not empty. Clear it? (y/N) y

== Analysis settings ======================================
 * Input:    /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5    
 * Output:    /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved
 * Processes:    1
 * Presets:    rna-r941.cfg
 * Basecall on-the-fly:        No (use previous analyses)
 * Trim 3' adapter:        Yes
 * Filter concatenated read:    No
 * Separate by barcode:        No
 * Real-time alignment:        No
 * FASTQ in output:        Yes
 * FAST5 in output:        No    
 * Basecall table in output:    No
===========================================================

==> Processing FAST5 files
| 100% of 134804 |#####################################################| Elapsed: 0:08:40 Time: 0:08:40

==> Finished.
== Result Summary ==
 * Successfully processed:    0
 * Possibly artifact:        0
 * Processing failed:        0
 * Failed to open:        134804
    - File could not be opened due to unknown error:    134804

So I had a look at the poreplex.log file as you suggested. This was part of the output (the same error message was repeated for every folder throughout the file).

2018-08-07 08:37:04,335 Starting poreplex version 0.2.1a1
2018-08-07 08:37:04,335 Command line: /home/moltox/anaconda3/bin/poreplex -i /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5 -o /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved --trim-adapter --keep-unsplit --fastq
2018-08-07 08:37:04,336 == Analysis settings ======================================
2018-08-07 08:37:04,336  * Input: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5 
2018-08-07 08:37:04,336  * Output: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/Basecalled_adaptersremoved
2018-08-07 08:37:04,336  * Processes: 1
2018-08-07 08:37:04,336  * Presets: rna-r941.cfg
2018-08-07 08:37:04,336  * Basecall on-the-fly:     No (use previous analyses)
2018-08-07 08:37:04,337  * Trim 3' adapter:     Yes
2018-08-07 08:37:04,337  * Filter concatenated read: No
2018-08-07 08:37:04,337  * Separate by barcode:     No
2018-08-07 08:37:04,337  * Real-time alignment:     No
2018-08-07 08:37:04,337  * FASTQ in output:     Yes
2018-08-07 08:37:04,338  * FAST5 in output:     No 
2018-08-07 08:37:04,338  * Basecall table in output: No
2018-08-07 08:37:04,338 ===========================================================
2018-08-07 08:37:04,338 
2018-08-07 08:37:04,966 [signal_analyzer.py:121] Unhandled exception KeyError: 'No group: None found in file: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5/029/bw09c011_20180413_FAH66965_MN18414_sequencing_run_CommunitytransV1_31515_read_58816_ch_21_strand.fast5'
Traceback (most recent call last):
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 121, in process
    return SignalAnalysis(filename, self).process()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 423, in process
    events = self.load_events()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 230, in load_events
    events = self.load_events_from_fast5()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 246, in load_events_from_fast5
    with Basecall1DTools(self.fast5) as bcall:
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/ont_fast5_api/analysis_tools/base_tool.py", line 51, in __init__
    raise KeyError('No group: {} found in file: {}'.format(group_name, self.filename))
KeyError: 'No group: None found in file: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5/029/bw09c011_20180413_FAH66965_MN18414_sequencing_run_CommunitytransV1_31515_read_58816_ch_21_strand.fast5'

2018-08-07 08:37:04,967 [signal_analyzer.py:121] Unhandled exception KeyError: 'No group: None found in file: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5/029/bw09c011_20180413_FAH66965_MN18414_sequencing_run_CommunitytransV1_31515_read_24352_ch_101_strand.fast5'
Traceback (most recent call last):
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 121, in process
    return SignalAnalysis(filename, self).process()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 423, in process
    events = self.load_events()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 230, in load_events
    events = self.load_events_from_fast5()
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/poreplex/signal_analyzer.py", line 246, in load_events_from_fast5
    with Basecall1DTools(self.fast5) as bcall:
  File "/home/moltox/anaconda3/lib/python3.6/site-packages/ont_fast5_api/analysis_tools/base_tool.py", line 51, in __init__
    raise KeyError('No group: {} found in file: {}'.format(group_name, self.filename))
KeyError: 'No group: None found in file: /media/moltox/data2/Ilias/MinIon/Basecalled_OK/comtransV1/fast5/029/bw09c011_20180413_FAH66965_MN18414_sequencing_run_CommunitytransV1_31515_read_24352_ch_101_strand.fast5'

hyeshik commented 6 years ago

It seems that your fast5 files have not been basecalled yet. What happens if you add the --basecall option to the command line? The option requires the albacore package from ONT.
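One way to check a file for basecalls before running poreplex is sketched below. It assumes the single-read FAST5 layout in which Albacore stores its results under /Analyses/Basecall_1D_NNN, and it needs the third-party h5py package. The function names are hypothetical, and this is not the check poreplex itself performs.

```python
def has_basecall_group(analysis_names):
    """True if any analysis group name looks like a 1D basecall result."""
    return any(name.startswith("Basecall_1D_") for name in analysis_names)


def is_basecalled(fast5_path):
    """Open a single-read FAST5 file and look for a basecall analysis group."""
    import h5py  # third-party; imported lazily so the helper above has no dependencies

    with h5py.File(fast5_path, "r") as f5:
        analyses = f5.get("Analyses")
        # Raw files straight from the sequencer may have no "Analyses" group at all.
        if analyses is None:
            return False
        return has_basecall_group(analyses.keys())
```

Calling is_basecalled on a handful of files from the input directory should be enough to tell whether --basecall is needed.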

ilisem commented 6 years ago

Thanks for your reply! Now it worked. I had only generated fastq files during basecalling, and I forgot that the fast5 files were the original, unbasecalled files. Thank you for taking the time to help me!

Kind regards

hyeshik commented 6 years ago

Thank you for taking the time to troubleshoot!

I see that the current error message is not clear enough. I'll make the next version of poreplex show a more explicit message for this type of error.

bvs commented 5 years ago

Hello,

I am getting an error when I run poreplex. Please find the output of poreplex below and let me know the solution if possible. Many thanks, Suresh


poreplex -i data/ -o test3/ --trim-adapter -p 4 --basecall

Poreplex version 0.1 by Hyeshik Chang <hyeshik@snu.ac.kr>

Output directory test3/ is not empty. Clear it? (y/N) y

== Analysis settings ======================================

==> Processing FAST5 files
/home/bonthas/python/lib/python3.5/site-packages/poreplex/basecall_albacore.py:83: FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), ...) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
  for field in field_names])
/home/bonthas/python/lib/python3.5/site-packages/poreplex/basecall_albacore.py:83: FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), ...) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
  for field in field_names])
/home/bonthas/python/lib/python3.5/site-packages/poreplex/basecall_albacore.py:83: FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), ...) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
  for field in field_names])
/home/bonthas/python/lib/python3.5/site-packages/poreplex/basecall_albacore.py:83: FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), ...) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
  for field in field_names])

==> Terminated.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/bonthas/python/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/bonthas/python/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bonthas/python/lib/python3.5/concurrent/futures/process.py", line 295, in _queue_management_worker
    shutdown_worker()
  File "/home/bonthas/python/lib/python3.5/concurrent/futures/process.py", line 253, in shutdown_worker
    call_queue.put_nowait(None)
  File "/home/bonthas/python/lib/python3.5/multiprocessing/queues.py", line 129, in put_nowait
    return self.put(obj, False)
  File "/home/bonthas/python/lib/python3.5/multiprocessing/queues.py", line 83, in put
    raise Full
queue.Full

hyeshik commented 5 years ago

Hi @bvs,

The error might be related to issue #8, which is fixed in today's new release, 0.3.1. Could you please try it again after updating poreplex?