JosePauloSavioli / Lavoisier

Format converter for LCI datasets
GNU General Public License v3.0

Validation of conversion results of ILCD and EcoSpold 2 #3

Open JosePauloSavioli opened 1 year ago

JosePauloSavioli commented 1 year ago

Lavoisier currently validates and tests conversion results semi-manually: the original and the converted dataset are both parsed, and their field information is compared. This is done mainly because of the lack of software that can import both ILCD .zip and EcoSpold 2 .spold files.
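To illustrate the kind of field-level comparison described above, here is a minimal sketch. It is not Lavoisier's actual test code: the element names, the two path maps, and the helper functions `extract_fields` and `diff_fields` are all hypothetical, and the toy documents ignore the namespaces and nesting of real ILCD and EcoSpold 2 files.

```python
import xml.etree.ElementTree as ET

# Hypothetical field-to-path maps; real ILCD and EcoSpold 2 files use
# namespaced, far richer schemas than these toy documents.
ORIGINAL_PATHS = {"name": "name", "location": "geo"}
CONVERTED_PATHS = {"name": "baseName", "location": "location"}


def extract_fields(xml_text, paths):
    """Return {field: stripped text or None} for each path under the root."""
    root = ET.fromstring(xml_text)
    fields = {}
    for field, path in paths.items():
        node = root.find(path)
        has_text = node is not None and node.text is not None
        fields[field] = node.text.strip() if has_text else None
    return fields


def diff_fields(original, converted):
    """Return {field: (original value, converted value)} for mismatches."""
    keys = sorted(set(original) | set(converted))
    return {
        key: (original.get(key), converted.get(key))
        for key in keys
        if original.get(key) != converted.get(key)
    }


# Toy stand-ins for an original dataset and its converted counterpart.
original_xml = "<activity><name>steel production</name><geo>BR</geo></activity>"
converted_xml = (
    "<process><baseName>steel production</baseName>"
    "<location>GLO</location></process>"
)

mismatches = diff_fields(
    extract_fields(original_xml, ORIGINAL_PATHS),
    extract_fields(converted_xml, CONVERTED_PATHS),
)
print(mismatches)
```

In this toy case only the location differs, so `mismatches` contains a single entry pairing `BR` with `GLO`. A check like this can show that a field kept its meaning, but not that the converted file behaves correctly once imported into software.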

This semi-manual validation is effective at pointing out implementation problems and can ensure, to some degree, that the converted results have the same meaning as the original information. What is still missing is a more robust validation step that tests the usability of the converted data in software. Such a step would give more confidence that the converted data behaves like the original and generates similar results, keeping in mind that conversions depend on external factors, such as the elementary flow mapping, that can affect flow conversions.

Brightway is in the early stages of implementing an ILCD input module, which makes it a candidate for this validation, since EcoSpold 2 import is already implemented there. In the meantime, is there any software that accepts both EcoSpold 2 and ILCD as input?

cmutel commented 11 months ago

@JosePauloSavioli I will try to get the various partial ILCD importers into a usable state for Brightway 😺

I also think you could provide fixture-based tests which would compare a complete dataset before and after conversion to a known good state. The current tests are good, but I have found that comparing against normalized XML files can expose some interesting bugs and helps make future development or refactoring possible with confidence.
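As a rough sketch of what comparing against normalized XML could look like, the snippet below canonicalizes both documents with the standard library's `xml.etree.ElementTree.canonicalize` (Python 3.8+) before comparing them, so differences in attribute order or indentation do not cause false failures. The helper names `normalize_xml` and `matches_fixture` and the toy documents are assumptions for illustration, not part of Lavoisier or Brightway.

```python
import xml.etree.ElementTree as ET


def normalize_xml(xml_text):
    """Canonicalize XML (C14N 2.0): sorted attributes, trimmed whitespace."""
    return ET.canonicalize(xml_data=xml_text, strip_text=True)


def matches_fixture(converted_xml, fixture_xml):
    """Return True if both documents are identical after normalization."""
    return normalize_xml(converted_xml) == normalize_xml(fixture_xml)


# Toy stand-ins: the same content with different attribute order and layout.
converted = (
    '<dataset version="1.1" id="abc">'
    "<name> steel production </name></dataset>"
)
known_good = (
    '<dataset id="abc" version="1.1">\n'
    "  <name>steel production</name>\n"
    "</dataset>"
)
changed = '<dataset id="abc" version="1.1"><name>steel</name></dataset>'

print(matches_fixture(converted, known_good))
print(matches_fixture(converted, changed))
```

In a real test suite the known good file would be stored next to the tests and loaded from disk, and fields that legitimately change on every run (generated UUIDs or timestamps, for example) would need to be masked before the comparison.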

JosePauloSavioli commented 7 months ago

Nice @cmutel!

Do you know if the importers will use ILCD elementary flow data, or if they will rely on some conversion to ecoSpold2 elementary flow data? I know it is really difficult to link processes from ILCD into an ecoSpold2 database, but it would be nice to at least have the possibility of importing ILCD data without any internal modifications made for it to behave within a specific database.

You are not the first to comment on testing with fixtures, and it is really something I should look into for the next versions. Locally, I run a couple hundred conversions for each format to test the code, and a few of them have an expected output to compare against, but nothing like a structured fixture test.