Open nandita0401 opened 3 years ago
Can you try replacing the line csv.field_size_limit(sys.maxsize)
in the file src\ocr\datahelpres.py
with the following code:
max_int = sys.maxsize
while True:
    # Decrease the max_int value by a factor of 10
    # as long as the OverflowError occurs.
    try:
        csv.field_size_limit(max_int)
        break
    except OverflowError:
        max_int = max_int // 10
It seems that sys.maxsize behaves differently across platforms and can cause this error. You can also try replacing sys.maxsize with a fixed number (e.g. csv.field_size_limit(2147483647), but I am not sure how big the number must be). If the number is too small, it will cause an error later during loading. Please try it and let me know how it goes.
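For reference, here is a minimal sketch of the same idea wrapped in a function. The name raise_csv_field_limit is hypothetical and not part of the repo. It assumes the error comes from the limit not fitting into a C long on the platform (on Windows a C long is 32-bit, so its maximum is 2147483647, while sys.maxsize is much larger on 64-bit Python).

```python
import csv
import sys


def raise_csv_field_limit(start=sys.maxsize):
    """Set the csv field size limit to the largest accepted value.

    Hypothetical helper for illustration; not part of the repo.
    Returns the limit that was finally accepted.
    """
    limit = start
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit into a C long on this platform,
            # so shrink it by a factor of 10 and try again.
            limit //= 10
```

Calling raise_csv_field_limit() once, before any CSV file is read, should be enough. The loop only ends because the value shrinks on every failed attempt; if the value is left unchanged in the except branch, the loop never finishes on platforms where sys.maxsize is rejected.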
It is taking too long to execute. It has been running for more than 12 hours. Is there any solution?
Oh, that definitely shouldn't take that long (just a few seconds, I guess). Did you try setting a fixed number, like csv.field_size_limit(2147483647)?
Where can I get the dataset?
Well, the steps are a bit old, and I would like to rework them once I have more time.
You have to download the datasets according to the instructions in the data/ folder (not all datasets are necessary).
Then go to src/data/ and run these scripts in the following order (some extra parameters might be necessary):
python data_extractor.py
python data_normalization.py
python data_create_sets.py --csv
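If it helps, the three steps can also be chained from one small Python script. This is only a sketch: the function names build_commands and run_all are hypothetical, it assumes it is started from the repo root, and it passes no extra parameters beyond the --csv flag listed above.

```python
import subprocess
import sys

# Script names and the --csv flag are taken from the steps above,
# in the stated order. No other parameters are assumed.
STEPS = [
    ["data_extractor.py"],
    ["data_normalization.py"],
    ["data_create_sets.py", "--csv"],
]


def build_commands(python=sys.executable):
    """Return the full command line for each step (hypothetical helper)."""
    return [[python, *step] for step in STEPS]


def run_all(cwd="src/data"):
    """Run every step inside src/data/ and stop at the first failure."""
    for cmd in build_commands():
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling run_all() executes the scripts one after another. With check=True a failing step raises an error, so a later script never runs on incomplete data.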
How to solve this error?
It should work now; just pull the latest changes from the repo.
Can you please elaborate?
On what exactly?
How to solve this error?
The error isn't solved for me either, even after using the code you provided.
OverflowError: Python int too large to convert to C long
I am getting this error for both the train.csv and dev.csv files. How can I solve it?