kynk94 closed this pull request 2 years ago
Hi! Did the unit tests pass? We had issues with the macOS port, as pread seems to have a different signature there.
I just ran the unit tests, and they failed. The check above only covers building and importing the ffcv components. pytest raises an error when iterating the Loader in tests\test_array_field.py. The error occurs when the numba JIT-compiled code is called to get the results:
https://github.com/libffcv/ffcv/blob/main/ffcv/loader/epoch_iterator.py#L134
The failing code is a CPUDispatcher(<function stage_0 at 0x0000023F941B43A0>) instance. The error is below, but there is no detailed traceback, so debugging is difficult and will need more time.
Windows fatal exception: access violation
Current thread 0x00003cdc (most recent call first):
File "c:\users\user\miniconda3\lib\site-packages\ffcv\loader\epoch_iterator.py", line 134 in run_pipeline
File "c:\users\user\miniconda3\lib\site-packages\ffcv\loader\epoch_iterator.py", line 80 in run
File "c:\users\user\miniconda3\lib\threading.py", line 932 in _bootstrap_inner
File "c:\users\user\miniconda3\lib\threading.py", line 890 in _bootstrap
My guess is that it might be related to pread, but I don't know how else I can help you :/
There seems to be a problem with pread too, but there is another problem as well. I tracked down the access violation error: the shapes of the source and destination were different when mem_cpy was running. Before mem_cpy, there is a part of the mem_read operation that looks up the address in mem_state. However, mem_state had an empty ptr and size. I fixed this bug and some unit tests passed, but the other ffcv/fields all have similar bugs. After I fix these, I'll comment on this PR.
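(As an illustration of why an empty mem_state entry surfaces as an access violation rather than a Python exception, here is a minimal sketch; this is not ffcv's actual mem_read/mem_cpy code:)

import numpy as np
from numba import njit

@njit
def raw_copy(dst, src):
    # like a raw memcpy, this trusts the destination length; numba does no
    # bounds checking by default, so a short source is read out of bounds
    for i in range(dst.shape[0]):
        dst[i] = src[i]

dst = np.empty(8, dtype=np.uint8)
src = np.empty(0, dtype=np.uint8)  # what an empty (ptr, size) entry yields
# raw_copy(dst, src)  # undefined behavior: inside JIT-compiled code this can
#                     # crash with an access violation and no Python traceback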
All bugs are resolved, and all unit tests pass. I skipped test_memory_leak.py because DatasetWriter does not have a write_pytorch_dataset method. And in test_rrc.py, when the mode is 'jpg', the assert is not always True, so I compared the output with the results on Linux and they seemed to match. Since all tests pass, cdll.msvcrt._read behaves exactly the same as pread.
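(For context, a rough sketch of what such a platform switch can look like; this is illustrative only, not ffcv's actual libffcv.py, and read_at is a hypothetical name:)

import ctypes
import os
import sys

if sys.platform == 'win32':
    libc = ctypes.cdll.msvcrt  # the Windows CRT exports _lseeki64 and _read

    def read_at(fd, buf, count, offset):
        # NOTE: unlike POSIX pread, this moves the file position
        libc._lseeki64(fd, ctypes.c_longlong(offset), os.SEEK_SET)
        return libc._read(fd, buf, count)
else:
    libc = ctypes.CDLL(None)  # the process's own libc

    def read_at(fd, buf, count, offset):
        # for real use, set argtypes/restype so 64-bit offsets pass correctly
        return libc.pread(fd, buf, count, offset)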
Commit bdeb248 was added for the following reasons:
# On Windows, the original code
allocation_table = np.concatenate([
    np.array(x).view(ALLOC_TABLE_TYPE) for x in allocations if len(x)
])
# throws "ValueError: When changing to a larger dtype, its size must be a
# divisor of the total size in bytes of the last axis of the array."
allocation_table = np.concatenate([
    np.array(x, dtype=ALLOC_TABLE_TYPE) for x in allocations if len(x)
])
# throws "ValueError: zero-dimensional arrays cannot be concatenated", and
allocation_table = np.array([
    np.array(x, dtype=ALLOC_TABLE_TYPE) for x in allocations if len(x)
])
# does not throw an error.
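(The likely root cause is that numpy's default integer dtype is 32-bit on Windows but 64-bit on Linux, so the view onto a structured dtype no longer divides evenly. A minimal sketch, assuming an illustrative record layout rather than ffcv's real ALLOC_TABLE_TYPE:)

import numpy as np

REC = np.dtype([('ptr', '<u8'), ('size', '<u8')])  # illustrative layout

rows = [(0, 64), (128, 64)]
arr = np.array(rows)   # int64 on Linux, int32 on Windows (numpy 1.x)
# arr.view(REC)        # on Windows: 2 * 4 bytes per row cannot be
#                      # reinterpreted as 16-byte records -> ValueError
fixed = np.array([np.array(r, dtype=REC) for r in rows])
print(fixed.dtype, fixed.shape)  # REC, (2,) - the variant that works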
And there is one more minor issue: if the dataset written by DatasetWriter is smaller than 16 MB, writing takes a very long time. I don't know why.
@kynk94 Do you think this could be merged into v1.0? Would you be willing to test it when I make a release candidate?
(Thanks for the work!)
Yes, it could be merged if this PR's logic causes no problems on Linux. And I can test it on my Windows desktop, because the test process is simple.
Looks very good to me. I'm merging into v1.0.0. Thank you so much @kynk94, I hope it's going to be useful to many people :smiley:
Resolves #63
Adds a package-finding step in setup.py, and modifies ffcv/libffcv.py and libffcv/libffcv.cpp so that ffcv imports without error. My two Windows PCs work without error, but I think it needs testing by others. And I didn't test on Linux.
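(For reference, the package-finding step can be as simple as the following; this is a sketch only, the real setup.py has more metadata, and the include pattern is an assumption:)

from setuptools import setup, find_packages

setup(
    name='ffcv',
    # discover ffcv and all its subpackages instead of listing them by hand
    packages=find_packages(include=['ffcv', 'ffcv.*']),
)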