Allow processing of duplicated attribute data columns

etsap-TIMES / xl2times

Open source tool to convert TIMES models specified in Excel

https://xl2times.readthedocs.io/

MIT License

Allow processing of duplicated attribute data columns #203

Closed olejandro closed 8 months ago

olejandro commented 8 months ago

Implications of duplicate labels for operations in pandas are described here: https://pandas.pydata.org/docs/user_guide/duplicates.html#consequences-of-duplicate-labels

This PR ensures the data in a duplicated column is returned only once (and not e.g. twice if 2 columns with the same name are present) during processing.
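
To illustrate the pandas behaviour linked above, here is a minimal sketch (not code from this PR; the frame and its values are made up for illustration). Selecting a duplicated label returns every column with that name, so code that expects a single Series can end up handling the same data more than once. Selecting by position avoids the ambiguity.

```python
import pandas as pd

# Illustrative frame with a repeated column label (values are invented).
df = pd.DataFrame(
    [[2012, 2016, 2020]],
    columns=["NCAP_START", "START", "NCAP_START"],
)

# Label-based selection returns a DataFrame holding BOTH matching columns.
selected = df["NCAP_START"]
print(type(selected).__name__, selected.shape)  # DataFrame (1, 2)

# The column index reports the duplication.
print(df.columns.is_unique)              # False
print(df.columns.duplicated().tolist())  # [False, False, True]

# Position-based selection always yields exactly one column (a Series).
by_position = [df.iloc[:, i] for i in range(df.shape[1])]
print([type(col).__name__ for col in by_position])
```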

olejandro commented 8 months ago

I've updated the benchmarks to cover a case like this:

| PROCESS | NCAP_START | START | NCAP_START |
| --- | --- | --- | --- |
| PRC1 | 2012 | | 2020 |
| PRC2 | 2012 | 2016 | |
| PRC3 | | 2016 | 2020 |
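
As a rough sketch of what "returned only once" could mean for a table like this one, the snippet below converts it to long format by iterating over column positions rather than labels. The helper `melt_by_position` and the output column names `attribute` and `value` are hypothetical and are not taken from xl2times. The sketch assumes the identifier column (`PROCESS`) is itself unique.

```python
import pandas as pd


def melt_by_position(df: pd.DataFrame, id_col: str) -> pd.DataFrame:
    """Hypothetical helper: one record per non-empty cell, even when
    several columns share the same label."""
    records = []
    ids = df[id_col]  # assumption: id_col appears only once in df.columns
    for pos, name in enumerate(df.columns):
        if name == id_col:
            continue
        # iloc picks exactly one column, so a duplicated label is never
        # expanded into several columns at once.
        column = df.iloc[:, pos]
        for proc, value in zip(ids, column):
            if pd.notna(value):
                records.append((proc, name, value))
    return pd.DataFrame(records, columns=[id_col, "attribute", "value"])


table = pd.DataFrame(
    [
        ["PRC1", 2012, None, 2020],
        ["PRC2", 2012, 2016, None],
        ["PRC3", None, 2016, 2020],
    ],
    columns=["PROCESS", "NCAP_START", "START", "NCAP_START"],
)

long_format = melt_by_position(table, "PROCESS")
print(long_format)
```

With the table above this yields six records, one per filled cell, four of which carry the `NCAP_START` label.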

Currently, the tests are failing on Demo 7 due to additional records generated by an alias. The records appear in the correct order, so the input for GAMS will be correct.

olejandro commented 8 months ago

Now that the additional records are gone, the tests seem to be failing because the updated Demo 7 and Demo 7r benchmarks fail when run on main. This makes sense, because main cannot handle the type of duplicate columns introduced in Demo 7. @siddharth-krishna I guess we'll have to override the check?

olejandro commented 8 months ago

Thanks, guys! I'll revert the change to the benchmarks version and add it, together with a unit test, in my next PR.