Open Mizryo opened 1 month ago
This is an important problem. It occurs because the two datasets are structured differently, and the fix is not too difficult. In multi_process_csv.py, changing "print("nofind this file", row[0])" at line 95 and "print("nofind this file", row[1])" at line 109 to "pass" might solve the problem.
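To illustrate the idea of silently ignoring missing table files, here is a minimal sketch. It is not the code from multi_process_csv.py: the function name filter_existing_pairs, the table_dir parameter, and the assumption that row[0] and row[1] hold table file names are all hypothetical, and it uses "continue" to drop the row rather than a bare "pass".

```python
import os


def filter_existing_pairs(rows, table_dir):
    """Keep only rows whose two referenced table files exist in table_dir.

    Hypothetical helper: assumes row[0] and row[1] are file names
    relative to table_dir, which may not match the real CSV layout.
    """
    kept = []
    for row in rows:
        first = os.path.join(table_dir, row[0])
        second = os.path.join(table_dir, row[1])
        if not os.path.isfile(first):
            # Skip quietly instead of printing "nofind this file".
            continue
        if not os.path.isfile(second):
            continue
        kept.append(row)
    return kept
```

Dropping the row entirely (rather than only silencing the message) is a design choice of this sketch. Whether later steps tolerate missing tables depends on the actual code.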
Dear all,
Thank you for the huge effort you have put into this project!
I have a question about the implementation of the pretraining stage for Deepjoin. In multi_process_csv.py, the function process_before_train calls process_task4, which looks for opendata tables regardless of which file (either sato_opendata_new.csv or sato_webtable_new.csv) you pass for the --tain_csv_file argument of deepjoin_train.py.
If the functions are working as intended, could you tell me how to handle this? If not, I would really appreciate it if you could rewrite them to work for the webtable dataset as well.
Thank you in advance and best regards, Ryosuke
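One possible way to make the table lookup depend on the CSV file that is passed in is sketched below. Everything here is an assumption for illustration: the helper resolve_table_dir, the DATASET_DIRS mapping, and the directory paths are hypothetical and do not come from the repository. Only the two CSV file names are taken from the question above.

```python
import os

# Hypothetical mapping from a dataset keyword to the directory holding its
# tables. The paths are placeholders, not the repository's real layout.
DATASET_DIRS = {
    "opendata": "data/opendata_tables",
    "webtable": "data/webtable_tables",
}


def resolve_table_dir(train_csv_file, dataset_dirs=DATASET_DIRS):
    """Pick the table directory whose keyword appears in the CSV file name."""
    name = os.path.basename(train_csv_file).lower()
    for keyword, directory in dataset_dirs.items():
        if keyword in name:
            return directory
    raise ValueError(f"Cannot infer dataset from file name: {train_csv_file}")


# Usage: the directory follows the CSV file instead of being fixed to opendata.
print(resolve_table_dir("sato_webtable_new.csv"))
print(resolve_table_dir("sato_opendata_new.csv"))
```

Raising an error for an unrecognized file name, rather than falling back to opendata, makes a mismatch between the CSV file and the table directory visible immediately.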