notAI-tech / deepsegment

A sentence segmenter that actually works!
http://bpraneeth.com/projects
GNU General Public License v3.0

h5py.h5f.open OSError: Unable to open file (truncated file: eof = 16777216, sblock->base_addr = 0, stored_eof = 80443280) #19

Closed youssefavx closed 4 years ago

youssefavx commented 4 years ago

Describe the bug and error messages (if any)

I've tried to run:

from pathlib import Path
from deepsegment import DeepSegment

segmenter = DeepSegment('en')
text = Path('data/bandt.txt').read_text()
tokens = segmenter.segment(text)

In both Python 3.7 and Python 3.6, I get the same error:

Traceback (most recent call last):
  File "ds.py", line 4, in <module>
    segmenter = DeepSegment('en')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/deepsegment/deepsegment.py", line 140, in __init__
    DeepSegment.seqtag_model.load_weights(checkpoint_path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/saving.py", line 492, in load_wrapper
    return load_function(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/network.py", line 1221, in load_weights
    with h5py.File(filepath, mode='r') as f:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (truncated file: eof = 16777216, sblock->base_addr = 0, stored_eof = 80443280)


Specify versions of the following libraries

  1. deepsegment: 2.3.0
  2. tensorflow: 1.13.1 / tensorflow-cpu: 1.15.0 / tensorflow-gpu (not installed)
  3. keras: 2.3.1

Expected behavior

For the sentences in the text to be tokenized.

bedapudi6788 commented 4 years ago

Delete the .DeepSegment_en directory in your home folder manually and rerun. The checkpoint download didn't finish.
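The manual deletion described above can also be scripted. This is a minimal sketch, assuming the cache directory is named `.DeepSegment_en` and sits directly in the home folder, as the comment says. The helper name `clear_deepsegment_cache` is hypothetical and not part of deepsegment.

```python
import shutil
from pathlib import Path


def clear_deepsegment_cache(lang_code="en", home=None):
    """Delete the cached model directory so it can be downloaded again.

    Assumption: the cache directory is named `.DeepSegment_<lang_code>` and
    lives in the home folder, as described in this thread.
    Returns True if a directory was removed, False if none was found.
    """
    home = Path(home) if home is not None else Path.home()
    cache_dir = home / f".DeepSegment_{lang_code}"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

After calling `clear_deepsegment_cache()`, rerunning `DeepSegment('en')` should trigger a fresh download, provided the cache really is in that location on your machine.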

youssefavx commented 4 years ago

Works great now! Thanks very much!

CamilleSchr commented 3 years ago

Hi, I have much the same problem, but I don't have the .DeepSegment_en directory in my home folder.

Here are the error messages:

Using TensorFlow backend.
2021-02-01 13:03:30.843192: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-02-01 13:03:30.843436: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:root:Tensorflow serving is not installed. Cannot be used with tesnorflow serving docker images.
WARNING:root:Run pip install tensorflow-serving-api==1.12.0 if you want to use with tf serving.
2021-02-01 13:03:35.678564: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-02-01 13:03:36.634608: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: Quadro T2000 computeCapability: 7.5
coreClock: 1.62GHz coreCount: 16 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.34GiB/s
2021-02-01 13:03:36.635303: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-02-01 13:03:36.637017: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2021-02-01 13:03:36.644040: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-02-01 13:03:36.645047: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-02-01 13:03:36.645828: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2021-02-01 13:03:36.646821: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
2021-02-01 13:03:36.647576: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2021-02-01 13:03:36.647772: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-02-01 13:03:36.648954: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-02-01 13:03:36.668580: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1d4397200e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-02-01 13:03:36.668786: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-02-01 13:03:36.669504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-01 13:03:36.670568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]
Traceback (most recent call last):
  File "test_deepsegment.py", line 6, in <module>
    segmenter = DeepSegment('en')
  File "C:\Users\user\Anaconda3\lib\site-packages\deepsegment\deepsegment.py", line 175, in __init__
    DeepSegment.seqtag_model.load_weights(checkpoint_path)
  File "C:\Users\user\Anaconda3\lib\site-packages\keras\engine\saving.py", line 492, in load_wrapper
    return load_function(*args, **kwargs)
  File "C:\Users\user\Anaconda3\lib\site-packages\keras\engine\network.py", line 1221, in load_weights
    with h5py.File(filepath, mode='r') as f:
  File "C:\Users\user\Anaconda3\lib\site-packages\h5py\_hl\files.py", line 406, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "C:\Users\user\Anaconda3\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (truncated file: eof = 16777216, sblock->base_addr = 0, stored_eof = 80443280)

Can you help me, please? Thanks!

bedapudi6788 commented 3 years ago

@CamilleSchr When initialising for the first time, DeepSegment downloads the required checkpoints from the releases section of this repo. It looks like your download was corrupted or incomplete.
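The numbers in the error message support this reading: `eof = 16777216` is the size of the file on disk (exactly 16 MiB), while `stored_eof = 80443280` is the size recorded inside the HDF5 file itself. Below is a rough sketch of checking a checkpoint before loading it. It assumes the expected size is the `stored_eof` value from this particular error, which may differ for other checkpoints or releases. The helper name `is_truncated` is hypothetical.

```python
import os

# Expected size in bytes, taken from `stored_eof` in the error message above.
# Assumption: this applies only to the English checkpoint seen in this thread.
EXPECTED_CHECKPOINT_BYTES = 80443280


def is_truncated(path, expected_bytes=EXPECTED_CHECKPOINT_BYTES):
    """Return True if the file on disk is smaller than the expected size."""
    return os.path.getsize(path) < expected_bytes
```

If `is_truncated(checkpoint_path)` returns True, the file is most likely a partial download and should be fetched again.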

You can download these files manually from:

https://github.com/notAI-tech/deepsegment/releases/download/v1.0.2/en_checkpoint
https://github.com/notAI-tech/deepsegment/releases/download/v1.0.2/en_params
https://github.com/notAI-tech/deepsegment/releases/download/v1.0.2/en_utils

and use them like this

from deepsegment import DeepSegment

# CHECKPOINT_PATH, PARAMS_PATH and UTILS_PATH are the local paths of the three downloaded files.
segmenter = DeepSegment(lang_code=None, checkpoint_path=CHECKPOINT_PATH, params_path=PARAMS_PATH, utils_path=UTILS_PATH)
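The manual download can also be scripted. The sketch below fetches the three release files listed in this thread using only the standard library. The function name `download_release_files`, the `fetch` parameter, and the destination directory are illustrative assumptions, not part of deepsegment.

```python
import os
import urllib.request

# Release location and file names as given in this thread.
BASE_URL = "https://github.com/notAI-tech/deepsegment/releases/download/v1.0.2"
FILE_NAMES = ("en_checkpoint", "en_params", "en_utils")


def download_release_files(dest_dir, fetch=urllib.request.urlretrieve):
    """Download the three release files into dest_dir and return their paths.

    `fetch` is called as fetch(url, local_path). It defaults to
    urllib.request.urlretrieve and can be replaced, for example in tests.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = {}
    for name in FILE_NAMES:
        local_path = os.path.join(dest_dir, name)
        fetch(f"{BASE_URL}/{name}", local_path)
        paths[name] = local_path
    return paths
```

The returned dictionary could then supply the arguments of the call shown above: `checkpoint_path=paths["en_checkpoint"]`, `params_path=paths["en_params"]` and `utils_path=paths["en_utils"]`.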