h2oai / datatable

A Python package for manipulating 2-dimensional tabular data structures
https://datatable.readthedocs.io
Mozilla Public License 2.0

After saving and reading the .jay file twice, an error will be reported #3474

Open badchangx opened 1 year ago

badchangx commented 1 year ago

BUG: When I save and read the '.jay' file twice, an error is reported like this:

Traceback (most recent call last):
  E:/BaiduNetdiskWorkspace/01-----------------ModelDevelopment/XAJ/V2/Main.py:28 in
    a = fread("111.jay")
  $PY/datatable\utils\fread.py:122 in _resolve_source_any
    return _resolve_source_file(src, tempfiles)
  $PY/datatable\utils\fread.py:203 in _resolve_source_file
    return _resolve_archive(file, None, tempfiles)
  $PY/datatable\utils\fread.py:326 in _resolve_archive
    out_result = core.open_jay(filename)
(with $PY = C:\Users\lenovo\AppData\Local\Programs\Python\Python36\lib\site-packages)

IOError: Invalid signature for a Jay file: last 4 bytes are \x00\x00\x00\x00

CODE:

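from datatable import fread

url = "https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
a = fread(url)
a.to_jay("111.jay")
a = fread("111.jay")   # first read of the saved file works
a.to_jay("111.jay")    # second save, to the same file that was just read
a = fread("111.jay")   # second read fails with the IOError above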

Environment:
datatable version: 1.0.0
Python version: 3.6.4
Operating system: Windows 11

Please help me, THANKS!

arnocandel commented 1 year ago

You can't overwrite a file that datatable is currently reading, because the file is memory-mapped. The suggested workaround is to use unique file names when saving temporary files, and to remove them once they are no longer needed.
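
A minimal sketch of that workaround, assuming the frame only needs a round trip through a temporary .jay file. The helper names unique_jay_path and save_and_reload are hypothetical and not part of datatable; the only datatable calls used are fread and Frame.to_jay, as in the report above.

```python
import os
import tempfile
import uuid


def unique_jay_path(directory=None, prefix="frame_"):
    """Return a fresh .jay path (hypothetical helper, not a datatable API).

    A random UUID in the name means the path never collides with a file
    that an existing frame may still have memory-mapped.
    """
    directory = directory or tempfile.gettempdir()
    return os.path.join(directory, f"{prefix}{uuid.uuid4().hex}.jay")


def save_and_reload(frame, directory=None):
    """Write `frame` to a new unique .jay file and read it back.

    Returns (reloaded_frame, path) so the caller can delete `path`
    once the reloaded frame is no longer needed.
    """
    # Imported lazily so unique_jay_path() can be used without datatable.
    import datatable as dt

    path = unique_jay_path(directory)
    frame.to_jay(path)           # never overwrites a file that is being read
    return dt.fread(path), path
```

With this, each save in the original script would go to a different file, for example a, path1 = save_and_reload(a) followed by a, path2 = save_and_reload(a). Each temporary file can be deleted with os.remove once no frame is reading from it any more; on Windows the delete will probably fail while a frame still has the file mapped, so drop the frame first.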