Open lidiancracy opened 1 year ago
Hi @lidiancracy, thank you for your interest in our work and for your fix!
I think there are timeouts on the Java side, and by splitting the work into batches you avoid them.
Thanks again, we would love to adopt this fix if you send it as a PR.
Best, Uri
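For anyone following along, here is a minimal sketch of the batching idea discussed in this thread. It is not the code from extract.py. The names `iter_batches`, `process_in_batches`, and `process_batch` are hypothetical and only illustrate splitting a list of directories into fixed-size groups, so that each extractor call handles a small group instead of the whole dataset.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def process_in_batches(
    dirs: Sequence[str],
    batch_size: int,
    process_batch: Callable[[List[str]], None],
) -> int:
    """Call `process_batch` once per batch and return the number of batches.

    `process_batch` is a hypothetical stand-in for whatever launches the
    extractor on one group of directories.
    """
    count = 0
    for batch in iter_batches(dirs, batch_size):
        process_batch(batch)
        count += 1
    return count
```

With this shape, a large training set becomes many short extractor runs rather than one long one, which may be why the batched version stays clear of the suspected timeouts.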
When using the process.sh script, I can process my test and validation datasets normally, but my training dataset fails to process, with no error output. I then added a batch field in extract.py and changed the directory scanning to work in batches instead of scanning all directories at once. After these modifications, I got the expected output for the training data. The modified .py file is as follows; we add the parameter "batch_size" and update the ExtractFeaturesForDirsList method: