Closed wfjvdham closed 3 months ago
Hi Wim,
Thank you for your clear report and repro case. We will investigate.
Dear Wim @wfjvdham ,
Thank you for reporting the issue.
This is an known issue for us. The issue lies more on the Vision side as we assume the users would provide input up to specs as per Pandas dictation. Since the excel handling within PGM-io was built on top of Pandas, the solution for now is to not include the extra information in an additional P field when exporting from Vision. We will make a more permanent solution in the coming sprints based on priorities.
I can also make the solution, what would be the right way to do it?
P
and P_1
table_data[field] *= multiplier
by table_data[(field, unit)] *= multiplier
what would be the right way to do it?
* Rename the columns internally to `P` and `P_1` * Change `table_data[field] *= multiplier` by `table_data[(field, unit)] *= multiplier`
it is on our agenda for the next refinement this afternoon. If we get to a design, we will post it here.
I can also make the solution
After that, whoever wants to pick it up can do that (based on priorities and availability between team AKI and team DGC). It's good to know that that's an option as well!
@wfjvdham the conclusion of the refinement is that there's some unknowns surrounding the (un)stability of the Vision Excel export. Next sprint, we will first do some more investigation to this before we finalize a design. Implementation will probably come the sprint after that, as agreed with Jaap.
@wfjvdham Please find explicit addressing of this issue in the documentation as well. https://github.com/PowerGridModel/power-grid-model-io/pull/251 N.B., the solution offered in the documentation is based on the user feedback from GOAT, who up until this point has experienced no problems with this solution.
Ok thanks 👍
Hi @Jerry-Jinfeng-Guo @petersalemink95 , do you have any updates about this bug?
I read #251 and it does not seem to provide a solution for us. Our team automates netchecks for K&O, and many usefull/critical information is stored in the "specifics" / "Bijzonderheden" tabs in Vision. For example MSR ID's, HLD ID's and "special contract types".
Wim has made a suggestion above how to solve this bug. Are there reasons not to implement this solution? If develop-capacity is an issue, we can also create a PR so that you only need to review. We can also create a feature branch to test the changes on all Vision files of all regions before merging to main. I prefer working (on the long term) from the main branch of PGM io for AKI, instead of using a feature branch ...
Hi @Jaap-van-Wijck , thank you for letting us know that disabling "specifics" is not enough for you. Internally renaming P
and P_1
is probably not the best way to go given the unstable nature of Vision. On top of this, not all teams depend on the "specifics" at the same level, the introduced logic can be a potential source of bug.
I would like to ask you the following questions:
terms_changed
input parameter that could be used (to change column names) when instancing the VisionExcel converter, have you tried that?*_load
s, you could find P
under multiply
field. Have you tried this? If above is not up to your expectation, please let us know and we will bring that up during the team meetings.
@Jerry-Jinfeng-Guo
terms_changed
is not an option I think, because there are two columns with the same name so how can I rename only 1 of them?Before I suggested 2 options, 1 was renaming one of the columns. If that is not the best way to go what about the other option I suggested:
table_data[field] *= multiplier
by table_data[(field, unit)] *= multiplier
There nothing changes to the names of the columns
Hi @wfjvdham @Jaap-van-Wijck , please see: https://github.com/PowerGridModel/power-grid-model-io/pull/261
Describe the bug
Currently columns are checked for duplicated and renamed if the combination of column name and unit is the same (in
ExcelFileStore._group_columns_by_index()
). When only the column name is the same but the unit is different the columns are read as is. This can cause a problem later when usingTabularData()._apply_unit_conversion()
To Reproduce
If for example an excel sheet contains two columns:
Those are not renamed because they are not identified as being a duplicate column. But here:
https://github.com/PowerGridModel/power-grid-model-io/blob/3d8eb4c1abe5fabcaa678238bd0a0655cba86684/src/power_grid_model_io/data_types/tabular_data.py#L146
The value of
field
is just P so the content of both columns is returned giving an error because string values can not be converted to W.Expected behavior
The problem can be solved by changing
table_data[field] *= multiplier
bytable_data[(field, unit)] *= multiplier
but I am not sure if two columns with the same name should exists (even though they have different units). Maybe better would be to do the grouping on only the first value of the index in theExcelFileStore._group_columns_by_index()
function.