Closed annalina closed 3 years ago
It seems the file MR_HUSE_2011_v3_3_17.csv does not exist.
This file is created when running the script excel2csv-cli -i EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.xlsb -o EXIOBASE_conversion_software/data/
What files are currently located in the folder EXIOBASE_conversion_software/data?
The directory contains these files:
exiobase_classifications_v_3_3_17.xlsx MR_HUSE_2011_v3_3_17_18.nt MR_HUSE_2011_v3_3_17_26.nt MR_HUSE_2011_v3_3_17_5.nt MR_HUSE_2011_v3_3_17_10.nt MR_HUSE_2011_v3_3_17_19.nt MR_HUSE_2011_v3_3_17_27.nt MR_HUSE_2011_v3_3_17_6.nt MR_HUSE_2011_v3_3_17_11.nt MR_HUSE_2011_v3_3_17_1.nt MR_HUSE_2011_v3_3_17_28.nt MR_HUSE_2011_v3_3_17_7.nt MR_HUSE_2011_v3_3_17_12.nt MR_HUSE_2011_v3_3_17_20.nt MR_HUSE_2011_v3_3_17_29.nt MR_HUSE_2011_v3_3_17_8.nt MR_HUSE_2011_v3_3_17_13.nt MR_HUSE_2011_v3_3_17_21.nt MR_HUSE_2011_v3_3_17_2.nt MR_HUSE_2011_v3_3_17_9.nt MR_HUSE_2011_v3_3_17_14.nt MR_HUSE_2011_v3_3_17_22.nt MR_HUSE_2011_v3_3_17_30.nt MR_HUSE_2011_v3_3_17.csv MR_HUSE_2011_v3_3_17_15.nt MR_HUSE_2011_v3_3_17_23.nt MR_HUSE_2011_v3_3_17_31.nt MR_HUSE_2011_v3_3_17_16.nt MR_HUSE_2011_v3_3_17_24.nt MR_HUSE_2011_v3_3_17_3.nt MR_HUSE_2011_v3_3_17_17.nt MR_HUSE_2011_v3_3_17_25.nt MR_HUSE_2011_v3_3_17_4.nt
I've also tried to run the excel2csv file on the huse dataset again, but I get the following error now:
Traceback (most recent call last):
File "/home/annalina/test/bin/excel2csv-cli", line 11, in
However, the first time (before running csv2rdf) it was successful:
Parsing file: EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.xlsb
Parsed sheet has size (9892, 7877)
Parsed 0
Parsed 50
Parsed 100
Parsed 150
Parsed 200
...
Parsed 9800
Parsed 9850
Saving to EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv
You can't run the excel2csv script because the MR_HUSE_2011_v3_3_17.xlsb file is not in the data folder.
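A quick way to confirm this before re-running anything is to check which of the expected files are actually present. The sketch below is only an illustration: the helper name `missing_inputs` is hypothetical and not part of EXIOBASE-conversion-software, and the file names are the ones mentioned in this thread.

```python
from pathlib import Path


def missing_inputs(data_dir, expected):
    """Return the names in `expected` that are not regular files in `data_dir`."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # File names taken from this thread; adjust the folder to your checkout.
    expected = [
        "MR_HUSE_2011_v3_3_17.xlsb",  # input of excel2csv-cli
        "MR_HUSE_2011_v3_3_17.csv",   # output of excel2csv-cli, input of csv2rdf-cli
    ]
    for name in missing_inputs("EXIOBASE_conversion_software/data", expected):
        print(f"missing: {name}")
```

Run it from the EXIOBASE-conversion-software folder so the relative path resolves the same way as in the commands in this thread.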
Next steps:
1: From the root folder for all repos, download the data again with these commands:
wget 'https://silo1.sciencedata.dk/themes/deic_theme_oc7/apps/files_sharing/public.php?service=files&t=20ee45e130a37e87c5b19e07b81b61ec&path=%2Fexiobase-3.3.17&files=EXIOBASE_3.3.17_hsut_2011.zip&download&g=' -O exiobase-dataset.zip
unzip exiobase-dataset.zip
rm -rf exiobase-dataset.zip
mv EXIOBASE_3.3.17_hsut_2011/MR_HUSE_2011_v3_3_17.xlsb EXIOBASE-conversion-software/EXIOBASE_conversion_software/data/
2: Enter the EXIOBASE-conversion-software folder and run the following command:
excel2csv-cli -i EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.xlsb -o EXIOBASE_conversion_software/data/
This creates the csv file in the data folder, which was missing before.
3: Now you can continue from this command, which extracts rdf data from the csv file:
csv2rdf-cli -i EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv -o EXIOBASE_conversion_software/data/ -c HUSE --flowtype input --multifile 100000 --merge True
It helps a lot to run only one command at a time, because later commands will interfere with the workflow if an earlier one did not run correctly.
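If you would rather script the steps than type them one by one, the same idea can be sketched in Python: run each command in order and stop at the first failure, so that a later step never runs on missing input. This is an assumption-laden illustration, not part of the conversion software; the helper `run_steps` is hypothetical, and the two commands are copied from the steps above.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first one that fails.

    Returns the number of steps that completed successfully.
    """
    for done, cmd in enumerate(steps):
        try:
            returncode = subprocess.run(cmd).returncode
        except OSError as exc:  # e.g. the command is not installed
            print(f"step {done + 1} could not start: {exc}", file=sys.stderr)
            return done
        if returncode != 0:
            print(f"step {done + 1} failed with exit code {returncode}", file=sys.stderr)
            return done
    return len(steps)


if __name__ == "__main__":
    data = "EXIOBASE_conversion_software/data/"
    steps = [
        ["excel2csv-cli", "-i", data + "MR_HUSE_2011_v3_3_17.xlsb", "-o", data],
        ["csv2rdf-cli", "-i", data + "MR_HUSE_2011_v3_3_17.csv", "-o", data,
         "-c", "HUSE", "--flowtype", "input", "--multifile", "100000", "--merge", "True"],
    ]
    completed = run_steps(steps)
    print(f"{completed} of {len(steps)} steps completed")
```

With this, a failed excel2csv-cli run stops the script before csv2rdf-cli is started.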
Thanks, Emil
@IKnowLogic @annalina can we close this issue?
@kuzeko Yes, I will close it
I've got the following problem (notice that the csv2rdf on hsup worked fine):
Parsing file: EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv
Traceback (most recent call last):
File "/home/annalina/test/bin/csv2rdf-cli", line 11, in
load_entry_point('EXIOBASE-conversion-software==0.5', 'console_scripts', 'csv2rdf-cli')()
File "/home/annalina/test/lib/python3.6/site-packages/EXIOBASE_conversion_software-0.5-py3.6.egg/EXIOBASE_conversion_software/bin/csv2rdf_cli.py", line 57, in main
File "/home/annalina/test/lib/python3.6/site-packages/EXIOBASE_conversion_software-0.5-py3.6.egg/EXIOBASE_conversion_software/__init__.py", line 22, in conversion
File "/home/annalina/test/lib/python3.6/site-packages/EXIOBASE_conversion_software-0.5-py3.6.egg/EXIOBASE_conversion_software/csv2rdf.py", line 422, in csv2rdf
File "/home/annalina/test/lib/python3.6/site-packages/pandas/io/parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/annalina/test/lib/python3.6/site-packages/pandas/io/parsers.py", line 448, in _read
parser = TextFileReader(fp_or_buf, kwds)
File "/home/annalina/test/lib/python3.6/site-packages/pandas/io/parsers.py", line 880, in __init__
self._make_engine(self.engine)
File "/home/annalina/test/lib/python3.6/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/home/annalina/test/lib/python3.6/site-packages/pandas/io/parsers.py", line 1891, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv does not exist: 'EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv'
mv: cannot stat ‘EXIOBASE_conversion_software/data/flows_merged.nt’: No such file or directory
gzip: output/exiobase_huse.nt: No such file or directory
Any suggestions?