anhaidgroup / py_stringsimjoin

Scalable String Similarity Joins in Python
BSD 3-Clause "New" or "Revised" License

Disk-based edit distance join tests fail in Appveyor #21

Open kvpradap opened 6 years ago

kvpradap commented 6 years ago

The test cases for disk-based edit distance join are failing in Appveyor. There were 45 such failed test cases, and all of them failed with the same error. The error message with the traceback is shown below.

------------------- Traceback and error msg ------------------------
Traceback (most recent call last):
  File "c:\python34\lib\site-packages\nose\case.py", line 198, in runTest
    self.test(*self.arg)
  File "C:\projects\py-stringsimjoin\py_stringsimjoin\tests\test_disk_edit_dist_join.py", line 114, in test_valid_join
    output_file_path = output_file_path)
  File "C:\projects\py-stringsimjoin\py_stringsimjoin\join\disk_edit_distance_join.py", line 146, in disk_edit_distance_join
    temp_dir, output_file_path)
  File "py_stringsimjoin\join\disk_edit_distance_join_cy.pyx", line 267, in py_stringsimjoin.join.disk_edit_distance_join_cy.disk_edit_distance_join_cy
    results = Parallel(n_jobs=n_jobs)(delayed(_edit_distance_join_split)(
  File "c:\python34\lib\site-packages\joblib\parallel.py", line 962, in __call__
    self.retrieve()
  File "c:\python34\lib\site-packages\joblib\parallel.py", line 865, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "c:\python34\lib\site-packages\joblib\_parallel_backends.py", line 515, in wrap_future_result
    return future.result(timeout=timeout)
  File "c:\python34\lib\site-packages\joblib\externals\loky\_base.py", line 431, in result
    return self.__get_result()
  File "c:\python34\lib\site-packages\joblib\externals\loky\_base.py", line 382, in __get_result
    raise self._exception
nose.proxy.OSError: [Errno 22] Invalid argument: 'C:\projects\py-stringsimjoin\0_05:11:25:188201.csv'
------------------- Traceback and error msg ------------------------

The detailed error log is attached at: log_appveyor.txt

Initial analysis suggests the error comes from the intermediate file that the disk-based edit distance join command creates to flush intermediate results to disk. The file name in the error message, 0_05:11:25:188201.csv, contains colons, and Windows does not allow colons in file names. That would be consistent with the error occurring only in Appveyor (the CI service used for Windows) and not in Travis-CI.
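One possible direction, sketched here only as an illustration: build the intermediate file name from a timestamp that uses hyphens instead of colons, so the same name is valid on both Windows and Linux. The helper name `make_temp_file_path` and its parameters `temp_dir` and `split_id` are hypothetical and are not taken from the py_stringsimjoin code; the actual naming logic in `disk_edit_distance_join_cy.pyx` may differ.

```python
import os
from datetime import datetime


def make_temp_file_path(temp_dir, split_id):
    """Return a path for an intermediate CSV file (hypothetical helper).

    The time is formatted with hyphens rather than colons, because a
    colon is not a valid character in a Windows file name.
    """
    # %H-%M-%S-%f gives hours, minutes, seconds, microseconds: 05-11-25-188201
    timestamp = datetime.now().strftime('%H-%M-%S-%f')
    file_name = '{}_{}.csv'.format(split_id, timestamp)
    return os.path.join(temp_dir, file_name)


# Example: a name of the form 0_HH-MM-SS-ffffff.csv inside 'some_dir'
example_path = make_temp_file_path('some_dir', 0)
```

If uniqueness rather than the timestamp itself is what matters, the standard library's `tempfile.mkstemp` could be another option, since it picks a name that is valid on the current platform.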