elastic / ember

Elastic Malware Benchmark for Empowering Researchers

mmap length is greater than file size #71

Open molo923 opened 3 years ago

molo923 commented 3 years ago

Hello, I'm very new to how the ember code works, and I want to try running the code suggested in the README.md:

import ember
X_train, y_train, X_test, y_test = ember.read_vectorized_features("/content/drive/MyDrive/Colab Notebooks/ember_dataset_2017_1/ember")
metadata_dataframe = ember.read_metadata("/content/drive/MyDrive/Colab Notebooks/ember_dataset_2017_1/ember")

Every time I run it, I get this error:

WARNING: EMBER feature version 2 were computed using lief version 0.9.0-
WARNING: lief version 0.11.5-37bc2c9 found instead. There may be slight inconsistencies
WARNING: in the feature calculations.

ValueError Traceback (most recent call last)

in ()
      1 import ember
----> 2 X_train, y_train, X_test, y_test = ember.read_vectorized_features("/content/drive/MyDrive/Colab Notebooks/ember_dataset_2017_1/ember")
      3 metadata_dataframe = ember.read_metadata("/content/drive/MyDrive/Colab Notebooks/ember_dataset_2017_1/ember")

1 frames
/usr/local/lib/python3.7/dist-packages/numpy/core/memmap.py in __new__(subtype, filename, dtype, mode, offset, shape, order)
    262 bytes -= start
    263 array_offset = offset - start
--> 264 mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
    265
    266 self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,

ValueError: mmap length is greater than file size

Any help would be appreciated, thank you!
RemingtonA commented 3 years ago

Hi,

I fixed this error by specifying the feature version of the EMBER dataset I was reading from. The warning in your output shows that the loader assumed feature version 2.

In your case, it should be version 1:

X_train, y_train, X_test, y_test = ember.read_vectorized_features("/content/drive/MyDrive/Colab Notebooks/ember_dataset_2017_1/ember", feature_version=1)
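For anyone wondering why the version matters: the traceback ends in numpy's memmap, which maps the file using a shape computed from the expected feature width. If the file was written with a narrower width than the one requested, the requested mapping is longer than the file. Here is a minimal sketch of that mismatch using only numpy. The widths 2351 and 2381, the row count, and the file name are illustrative assumptions, not values taken from this thread or read from ember itself.

```python
import os
import tempfile

import numpy as np

# Assumed feature widths, for illustration only (not read from ember).
NDIM_V1 = 2351
NDIM_V2 = 2381
N_ROWS = 10  # tiny stand-in for the real number of samples


def try_memmap(path, n_rows, ndim):
    """Map the file read-only; return its shape, or the error text on failure."""
    try:
        arr = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, ndim))
        shape = arr.shape
        del arr  # release the mapping before the temp dir is removed
        return shape
    except (ValueError, OSError) as exc:
        return str(exc)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; it is written with the narrower (v1) width.
    path = os.path.join(tmp, "X_train.dat")
    np.zeros((N_ROWS, NDIM_V1), dtype=np.float32).tofile(path)

    ok = try_memmap(path, N_ROWS, NDIM_V1)   # width matches the file
    bad = try_memmap(path, N_ROWS, NDIM_V2)  # asks for more bytes than exist

print("matching width:", ok)
print("wider width:", bad)
```

The second call requests more bytes than the file holds, so the read-only mapping fails instead of returning an array. Passing the feature version that matches the files on disk avoids this.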

I hope this solves your issue.

molo923 commented 3 years ago

Okay, I will try it, thank you! @RemingtonA