Open svetozarstojkovic opened 5 years ago
Another problem occurs when using dump_to_path: it writes blank lines between rows. Possible solution: https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row
Blank lines between the rows have been fixed with the following steps:
This fix should be checked on Linux and Mac to see how it behaves there.
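For reference, the linked Stack Overflow question covers the usual cause: on Windows, a CSV file opened in text mode without `newline=''` has the csv module's `\r\n` terminator translated to `\r\r\n`, which shows up as a blank line after each row. The sketch below shows that general technique only. It is not necessarily the change made in dataflows, and `write_rows` and `demo` are hypothetical names used for illustration.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    # newline='' disables newline translation, so the '\r\n' terminator
    # emitted by csv.writer is written as-is on every platform.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)


def demo():
    # Hypothetical helper: write a tiny file and return its raw bytes.
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'out.csv')
    write_rows(path, [['name', 'value'], ['a', 1]])
    with open(path, 'rb') as f:
        return f.read()
```

Because translation is switched off explicitly, the bytes written should be the same on Windows, Linux and Mac, which is the behaviour the cross-platform check above is meant to confirm.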
As for the bug - can you add a print statement before that line to see what filename it is attempting to open (i.e. print(self.tmpfile.name))?
Running into the same issue on Windows. I can run the same code on Ubuntu 18.04 on WSL. Here is the traceback for the failure.
Traceback (most recent call last):
  File ".\flows\subdivision_endonyms.py", line 85, in <module>
    subdivision_endonyms_cldr.process()
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\base\flow.py", line 15, in process
    return self._chain().process()
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\base\datastream_processor.py", line 86, in process
    collections.deque(res, maxlen=0)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\dumpers\dumper_base.py", line 69, in row_counter
    for row in iterator:
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\dumpers\file_dumper.py", line 76, in rows_processor
    for row in resource:
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\base\schema_validator.py", line 46, in schema_validator
    for i, row in enumerate(iterator):
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\helpers\rows_processor.py", line 11, in process_resource
    yield from self.func(resource)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\printer.py", line 61, in func
    for i, row in enumerate(rows):
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\validate.py", line 55, in process_resource
    yield from self.validator(res)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\validate.py", line 50, in func
    yield from schema_validator(res.res, res, on_error=self.on_error)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\base\schema_validator.py", line 46, in schema_validator
    for i, row in enumerate(iterator):
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dataflows\processors\sort_rows.py", line 32, in _sorter
    db = KVFile()
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\kvfile\kvfile.py", line 19, in __init__
    self.db = DB_ENGINE.connect(self.tmpfile.name)
sqlite3.OperationalError: unable to open database file
@akariv Printed out the tmpfile and got C:\Users\lenovo\AppData\Local\Temp\tmpnz1gwmsa
A temporary fix by @paulmz1 in https://github.com/datasets/covid-19/issues/77 was to run it from memory instead of from the temporary file by editing kvfile.py. Seems like an issue with kvfile.
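To illustrate the workaround: the traceback shows kvfile passing `self.tmpfile.name` to the SQLite `connect` call. If `tmpfile` is a `tempfile.NamedTemporaryFile` (an assumption, since the kvfile source is not shown here), the Python documentation notes that on Windows the name cannot be used to open the file a second time while it is still open, which would explain the error. The sketch below shows two generic alternatives using only the standard library. The function names and the `kv` table are hypothetical, not kvfile's actual code.

```python
import os
import sqlite3
import tempfile


def open_memory_db():
    # In-memory database: no file on disk, so no file-locking issue.
    return sqlite3.connect(':memory:')


def open_tempdir_db():
    # Alternative: let SQLite create its own file inside a fresh temporary
    # directory instead of reusing the name of an already-open temp file.
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'kv.db')  # hypothetical file name
    return sqlite3.connect(path), path


def roundtrip(conn):
    # Minimal key/value round trip to confirm the connection is usable.
    conn.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')
    conn.execute('INSERT INTO kv VALUES (?, ?)', ('a', '1'))
    row = conn.execute('SELECT value FROM kv WHERE key = ?', ('a',)).fetchone()
    return row[0]
```

The in-memory option matches the temporary fix described above, but it trades disk usage for RAM, which may matter when sorting large resources.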
@nirabpudasaini - Thanks for the info! Just to verify - do you have write access to that directory? Does it exist?
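One way to answer both questions is a quick check like the sketch below. `check_temp_dir` is a hypothetical helper, not part of dataflows or kvfile. It reports whether the temporary directory exists and whether a file can actually be created in it.

```python
import os
import tempfile


def check_temp_dir(directory=None):
    # Default to the directory Python itself uses for temporary files.
    directory = directory or tempfile.gettempdir()
    exists = os.path.isdir(directory)
    writable = False
    if exists:
        try:
            # Creating a real file is a more reliable test than os.access,
            # particularly on Windows.
            with tempfile.TemporaryFile(dir=directory):
                writable = True
        except OSError:
            writable = False
    return {'directory': directory, 'exists': exists, 'writable': writable}


if __name__ == '__main__':
    print(check_temp_dir())
```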
@nirabpudasaini - please pip install kvfile==0.0.8 and let me know if that works.
As a developer using the Windows operating system, I want to use dataflows for data wrangling so that I can package data more easily.
Example error when running https://gitlab.com/datopian/datasets-fedex/blob/master/flows/country_currencies.py :