data4development / iati-workbench

Create IATI data files from a mix of spreadsheets and IATI input files.
https://developer.data4development.nl/iati-workbench/
GNU Affero General Public License v3.0

Refactor spreadsheet handling: remove step of CSV+CSVXML #37

Open rolfkleef opened 4 years ago

rolfkleef commented 4 years ago

Currently, the spreadsheets2iati workflow converts the spreadsheets to CSV files using LibreOffice, followed by a BaseX-dependent step that converts each CSV file into XML. This XML file is then processed.

Using a different LibreOffice conversion, the data can be stored in flat XML OpenDocument files instead. Each sheet in the spreadsheet is then available as a table:table element in that single XML file.
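To illustrate the structure, here is a minimal Python sketch that lists the sheets found in a flat XML OpenDocument spreadsheet. The sample document and the function name `sheet_names` are made up for illustration; only the OpenDocument namespaces and the `table:name` attribute come from the format itself.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in flat XML spreadsheet files.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
}

# Hypothetical, heavily trimmed sample: a real file holds rows and styles too.
SAMPLE = """<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0">
  <office:body>
    <office:spreadsheet>
      <table:table table:name="Activities"/>
      <table:table table:name="Budgets"/>
    </office:spreadsheet>
  </office:body>
</office:document>"""


def sheet_names(fods_xml):
    """Return the name of every table:table (one per sheet) in the document."""
    root = ET.fromstring(fods_xml)
    name_attr = "{%s}name" % NS["table"]
    return [t.get(name_attr) for t in root.iterfind(".//table:table", NS)]


print(sheet_names(SAMPLE))
```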

It would be possible to write an XSLT to transform those tables into the same "CSVXML" files currently used: use the first row in each table to determine the column names, and all other rows to generate XML similar to what BaseX produces.
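The mapping itself is small. The sketch below shows the logic in Python rather than XSLT, purely to illustrate it. The output shape (a `csv` root with one `record` per data row and one child element per column) and the rule for turning header text into element names are assumptions about what the current "CSVXML" looks like, and `table_to_csvxml`, `row_values` and `element_name` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
NS = {"table": TABLE, "text": TEXT}


def row_values(row):
    """Return the cell texts of one table:table-row, expanding repeated cells."""
    values = []
    for cell in row.findall("table:table-cell", NS):
        text = "\n".join("".join(p.itertext()) for p in cell.findall("text:p", NS))
        repeat = int(cell.get("{%s}number-columns-repeated" % TABLE, "1"))
        values.extend([text] * repeat)
    return values


def element_name(header):
    """Assumed naming rule: replace characters that are invalid in XML names."""
    name = re.sub(r"[^A-Za-z0-9_.-]", "_", header.strip())
    return name if re.match(r"[A-Za-z_]", name) else "_" + name


def table_to_csvxml(table):
    """Build <csv><record>...</record></csv> from one table:table element."""
    rows = [row_values(r) for r in table.iterfind(".//table:table-row", NS)]
    rows = [r for r in rows if any(v.strip() for v in r)]  # skip empty rows
    csv = ET.Element("csv")
    if not rows:
        return csv
    header, data = rows[0], rows[1:]
    for values in data:
        record = ET.SubElement(csv, "record")
        for column, value in zip(header, values):
            if column.strip():  # ignore columns without a header
                ET.SubElement(record, element_name(column)).text = value
    return csv


# Hypothetical one-sheet sample with a header row and one data row.
SAMPLE = """<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" table:name="Activities">
  <table:table-row>
    <table:table-cell><text:p>IATI identifier</text:p></table:table-cell>
    <table:table-cell><text:p>Title</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>XM-EX-1</text:p></table:table-cell>
    <table:table-cell><text:p>Example activity</text:p></table:table-cell>
  </table:table-row>
</table:table>"""

result = ET.tostring(table_to_csvxml(ET.fromstring(SAMPLE)), encoding="unicode")
print(result)
```

An XSLT version would follow the same outline: one template matching table:table, the first table:table-row providing the names, and every later row emitting one record.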

A next step may be to work directly with those tables, which would mean reorganising how the header row in a table is recognised.