jts / nanopolish

Signal-level algorithms for MinION data
MIT License

HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140184755549952: #811

Closed hm-git-hub closed 3 years ago

hm-git-hub commented 4 years ago

Hi, when I use nanopolish 0.13.2 to detect polyA tails, I run into some errors. 'nanopolish index' finishes, but when I then run 'nanopolish polya' I get the errors below. How can I fix this? Thanks!

[readdb] indexing ./fast5
[readdb] num reads: 879679, num reads with path to fast5: 879679
[warning] fast5 file is unreadable and will be skipped: ./fast5/FAN04901_90bfe2e3_34.fast5
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140184755549952:
  #000: H5F.c line 604 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 990 in H5F_open(): unable to open file: time = Sat Jul 25 11:32:07 2020
, name = './fast5/FAN04901_90bfe2e3_29.fast5', tent_flags = 0

gRNA_polya.tsv
1023cb2d-9f44-4b7a-8bde-69892775ad16 NC_045512.2 29778 -1.0 -1.0 -1.0 -1.0 -1.00 -1.00 READ_FAILED_LOAD

jts commented 4 years ago

Hi,

Does the program continue, or does it terminate at that point? It appears there is a problem with that file, but it should be skipped.

Jared


hm-git-hub commented 4 years ago

Thank you for your reply. The program does continue, but every fast5 file triggers the error. 'nanopolish polya' still produces the .tsv file, but every read in it is marked 'READ_FAILED_LOAD'.
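One way to confirm how many reads failed is to tally the last column of the polya output. The sketch below assumes the file is tab-separated, starts with a header row, and carries the QC tag in the final field, as in the row shown above. The function name `count_qc_tags` is illustrative and not part of nanopolish.

```python
import csv
from collections import Counter


def count_qc_tags(handle):
    """Tally the values in the last column of a tab-separated polya output.

    Assumes the first row is a header and the QC tag is the final field.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row
    counts = Counter()
    for row in reader:
        if row:  # csv yields an empty list for blank lines; ignore them
            counts[row[-1]] += 1
    return counts
```

It could be called as `count_qc_tags(open("gRNA_polya.tsv"))`. If the only tag present is READ_FAILED_LOAD, the problem lies in opening the fast5 files rather than in the polyA estimation itself.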

Best wishes!


Ming He Beijing Institute of Genomics (BIG), Chinese Academy of Sciences


jts commented 4 years ago

In that case I suggest checking that the paths to the files are correct and that you have permission to read them.
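This check can be scripted. The following is a minimal sketch, assuming the `.index.readdb` file written by 'nanopolish index' is tab-separated with the read ID in the first column and the fast5 path in the second. `find_unreadable` is a hypothetical helper, not part of nanopolish.

```python
import os


def find_unreadable(readdb_lines):
    """Return the sorted fast5 paths that are missing or not readable.

    Each input line is assumed to look like "<read_id><TAB><fast5_path>".
    """
    seen = set()
    bad = set()
    for line in readdb_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[1]:
            continue  # no path recorded for this read
        path = fields[1]
        if path in seen:
            continue  # many reads share one multi-read fast5 file
        seen.add(path)
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            bad.add(path)
    return sorted(bad)
```

Relative paths such as ./fast5/... are resolved against the current working directory, so the script should be run from the same directory as nanopolish. An empty result means the paths and permissions are fine and the cause lies elsewhere.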

Jared


hm-git-hub commented 4 years ago

To rule out path mistakes, I also used absolute paths to the fast5 files when building the index, and I checked the permissions. Basecalling the same files with Guppy v4.0.11 works. I have attached one of my fast5 files. Thank you!

Best wishes!


Ming He Beijing Institute of Genomics (BIG),  Chinese Academy of Sciences

 


Large attachment sent from QQ Mail

FAN04901_90bfe2e3_30.fast5 (364.81M, expires 2020-08-30 11:24). Download page: http://mail.qq.com/cgi-bin/ftnExs_download?k=7c61383638c9568a23eb650a1e34564c05520f50010055551e035d505b1950010a5815575d57054e060209520906505251500c50382e6425722f08020104553c0a515a505d0601506c5208185e551717066105&t=exs_ftn_download&code=3a8684dc

ritma001 commented 3 years ago

Hi there,

I have recently been getting the same error. I found some related posts, but none of the suggested solutions works. Is there any advice on how to fix it?

Best,

Wannisa

hm-git-hub commented 3 years ago

Hi, ultimately I did not solve this error. Because the fast5 files are published data, I cannot determine what causes it. The data we sequenced ourselves work fine, and we could not find any difference between the published data and our own when inspecting the contents of the fast5 files with the HDF5 tools. My guess is that the MinKNOW version or its parameters affect the details of the fast5 files. If you manage to solve this, please let me know.

Best wishes!


Ming He Beijing Institute of Genomics (BIG), Chinese Academy of Sciences


jts commented 3 years ago

Hi,

I have added a small program to help debug this issue. Can you please see here and follow the instructions I posted in this issue:

https://github.com/jts/nanopolish/issues/826#issuecomment-702195345

Jared

pphector commented 3 years ago

Hello, I am running into this same issue. I have tried everything suggested here and have not been successful. A summary of the issue:

The readdb file contains 43 fast5 files
[fast5] OK: opened /lustre03/project/6007512/C3G/projects/Moreira_COVID19_Genotyping/MGC_ONT_processing/LSPQ_Outbreak904_PAE77942_20201028/20201029_0056_1G_PAE77942_c76d332d/fast5_pass/barcode31/PAE77942_pass_barcode31_08d9e3d8_0.fast5
[read] ERROR: could not read raw samples for 000c44a0-2640-4c89-bc6e-0b299114d55d
[read] ERROR: could not read raw samples for 001db7fc-3e59-4ac5-8493-ead37b150a0a
[read] ERROR: could not read raw samples for 0041c2e2-2c44-452a-8e2c-a97333b2f645
[read] ERROR: could not read raw samples for 006741f1-f603-44db-b124-cbad513fc3f3
...

This is surprising to me, because when I use the HDF5 tools directly on the fast5 files, I am able to retrieve the headers corresponding to these reads. I am exploring using the nanopore fast5 API directly to investigate these reads, and I will update when I know more. In the meantime, is there any other known solution for this kind of issue?

Finally, some context: this was the first run where the technician activated an option in MinKNOW that identifies barcodes in the middle of a sequence; such reads then end up in the unclassified folder. This seems to apply to the fast5 files as well. The analysis was done with the results of the PromethION live basecalling as well as an additional basecalling and demultiplexing run done on a separate server.

jts commented 3 years ago

Hi @pphector,

I suspect this has something to do with the barcoding option. Can you run h5dump -n on one of the fast5s and send the results?

Jared

pphector commented 3 years ago

Hi @jts. Here is the h5dump output for one of the barcodes (barcode 31). It contains only the first 1000000 lines, because otherwise the file is too big for GitHub issues. I am also adding the first 1000000 lines for the unclassified fast5 files. If you need the full files, I will gladly provide them through other means. Thank you so much for your help.

unclassified_fast5_items_part1.txt.gz

barcode31_fast5_items_part1.txt.gz

jts commented 3 years ago

Thanks, that looks OK to me. Can you uncomment this line and re-run after recompiling? It will add some more useful debugging messages:

https://github.com/jts/nanopolish/blob/master/src/io/nanopolish_fast5_io.cpp#L15

pphector commented 3 years ago

Here is the head of the output of the latest run, with the additional debugging messages. From reading them, the issue seems to be an I/O failure. I have checked the permissions again, and they look good to me. Besides, other programs such as guppy and h5dump have no trouble opening these directories and files. I also tried running this from two different filesystems on our server, which did not help. I'm really at a loss as to why this is happening.

barcode31_20201101_part1.log.gz

Edit. I had uploaded the wrong file. This is the correct one now.

jts commented 3 years ago

This is the problem:

#004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered

The fast5 files have been compressed with the vbz method, but the decompression plugin is not loaded. To allow nanopolish to read these files, you need to follow these instructions:

https://github.com/nanoporetech/vbz_compression/issues/5

I have an issue open to emit a better error message when the vbz plugin is not loaded, but I'm not sure yet how to check for it:

https://github.com/jts/nanopolish/issues/766
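The diagnostic output can also be scanned for this message automatically. A minimal sketch, assuming the wording matches the HDF5-DIAG line quoted above; `missing_filters` is a hypothetical helper, not part of nanopolish or HDF5.

```python
import re

# Matches HDF5-DIAG lines such as:
#   #004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered
FILTER_RE = re.compile(r"required filter '([^']+)' is not registered")


def missing_filters(log_text):
    """Return the sorted, de-duplicated filter names reported as unregistered."""
    return sorted(set(FILTER_RE.findall(log_text)))
```

Applied to a captured stderr log, a non-empty result such as a list containing 'vbz' points to a missing decompression plugin rather than to a path or permission problem.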

pphector commented 3 years ago

I can confirm now this was the issue. Installing the plugin and setting the HDF5_PLUGIN_PATH environment variable has fixed it. Thanks for your help Jared!
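For anyone wanting to verify their setup, here is a rough sketch that lists candidate plugin files in the directories named by HDF5_PLUGIN_PATH. It assumes the variable may hold several directories separated by the platform path separator and that the plugin file name contains "vbz"; both are assumptions, and `find_vbz_plugins` is a hypothetical helper.

```python
import os


def find_vbz_plugins(plugin_path=None):
    """List files whose names contain 'vbz' in the HDF5 plugin directories.

    plugin_path defaults to the HDF5_PLUGIN_PATH environment variable.
    The 'vbz' file-name match is an assumption, not an official rule.
    """
    if plugin_path is None:
        plugin_path = os.environ.get("HDF5_PLUGIN_PATH", "")
    found = []
    for directory in plugin_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue  # skip empty entries and missing directories
        for name in sorted(os.listdir(directory)):
            if "vbz" in name.lower():
                found.append(os.path.join(directory, name))
    return found
```

An empty list means either the variable is unset in the shell that launches nanopolish or the directory it points to does not contain the plugin.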

And I agree, a better error message in the future would be helpful. I am also following up with the lab to trace back at what point the compression was triggered, because according to them they did not do anything different (besides the barcode detection feature), and we didn't have this issue in previous runs. It would be useful to know if anything changed in the MinKNOW backend that we were not aware of.

jts commented 3 years ago

@sabiqali has added a check for this situation - it should now give a better error message when this occurs