Closed korsbakken closed 2 months ago
Thanks!
One suggestion for adding a test: add a few more keyword arguments, like header, usecols, skiprows, nrows, then make a copy of https://github.com/IAMconsortium/pyam/blob/bc56c53f5a835b857aadab21e4a3f8cb74d9dab6/tests/test_io.py#L116 with
import_df = IamDataFrame(TEST_DATA_DIR / "test_df.xls", nrows=2)
assert_iamframe_equal(test_df_year.filter(scenario="scen_a"), import_df)
Attention: Patch coverage is 96.29630%
with 1 line
in your changes missing coverage. Please review.
Project coverage is 95.0%. Comparing base (
ddbb88e
) to head (6a300c2
). Report is 32 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
pyam/utils.py | 91.6% | 1 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I guess that the pytest-legacy test (with pandas 2.1.2) is failing because engine_kwargs={"data_only": False}
is not supported there? Can you add a pytest-skip for versions lower than 2.2 for that test? Then this PR is good to be merged!
I guess that the pytest-legacy test (with pandas 2.1.2) is failing because
engine_kwargs={"data_only": False}
is not supported there? Can you add a pytest-skip for versions lower than 2.2 for that test? Then this PR is good to be merged!
Yes. In fact, both the added tests fail with pandas < 2.2 for different reasons. test_read_xlsx_calamine
fails because calamine wasn't added as an engine before 2.2.0. And test_read_xlsx_kwargs
fails because pandas included hard-coded engine_kwargs
that couldn't be overridden until the same version. I will add a warning about the latter if anyone tries to use engine_kwargs
with pandas < 2.2 in the main code, and skip the tests in that case.
I pushed a new version now, with a warning if anyone tries to use engine_kwargs
with pandas version lower than 2.2.0, and skipping the tests if pandas is lower than that version.
In order to actually test the new functionality in CI, I also added a new optional dependency group calamine
to pyproject.toml and updated the pytest workflows to include it. In order to make that work, I also had to update poetry.lock. If you would rather not include that, you can either reject the last two commits if that's possible, or just let me know and I will push a revert commit.
Detects kwargs passed to
IamDataFrame.__init__
and toread_pandas
that need to be passed topandas.ExcelFile.__init__
to be handled properly.This PR closes #876
Please confirm that this PR has done the following:
[ ] Tests Added[ ] Documentation Added[ ] Name of contributors Added to AUTHORS.rstTests: I have run existing tests, and tested locally that the new code reads Excel files correctly with
engine="calamine"
as a kwarg topyam.IamDataFrame
. But a new test under/tests/
that tests the added/fixed functionality (rather than just regression testing) would require that probably aren't intended to be part of the core requirements. If the lead developers would like me to do this, I would appreciate guidance on how to add a test that requires extra optional packages, without breaking the existing test suite.Documentation: As far as I can tell, this is a bug fix, and shouldn't require new documentation. The one new helper function,
utils.get_excel_file_with_kwargs
, has a docstring that explains how it's used.Name in AUTHORS.rst: This is a fairly trivial addition, and I don't need my name in the AUTHORS file for it. But I can add it in a update commit if you do want it there.
Description of PR
The PR modifies lines in
pyam.core.IamDataFrame.__init__
andpyam.utils.read_pandas
that create apandas.ExcelFile
instance to read an Excel file. The modifications ensure that keyword arguments that are recognized bypandas.ExcelFile.__init__
are passed on to thepandas.ExcelFile
instance and removed from calls to later methods (which would cause, e.g.,pandas.read_excel
to raise an exception).The modifications add a new helper function
get_excel_file_with_kwargs
topyam.utils
. The function takes an excel file path and arbitrary kwargs, extracts the keyword arguments that are accepted bypandas.ExcelFile.__init__
, creates a newExcelFile
instance with those keyword arguments, and returns it along with a dict of the other keyword arguments that were not used.The new functionality makes it possible to use the keyword arguments
engine
,storage_options
andengine_kwargs
withIamDataFrame.__init__
andutils.read_pandas
when reading Excel files, which could cause an exception to be raised in the previous version.