it176131 closed this pull request 3 months ago
Thanks a lot for the PR, really appreciate it and hope to review it in the upcoming days!
Sure thing! I wasn't sure what to put regarding newer version release dates so I bumped the patch number by one and set the release date to "TBD". Lmk if I should change anything.
Hey @rasbt it looks like the pipeline checks keep failing for files unrelated to this PR. Should I log a separate issue and open a new PR to fix those too?
I'd say as long as the tests for the new feature pass then it should be ok. There have been some other tests that have been failing in some submodules for several months due to certain software version updates and minor precision differences I think. I haven't had time to investigate yet.
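Version-to-version precision drift like this is usually handled by comparing with a tolerance rather than exact equality. A generic sketch of the pattern (not tied to any specific mlxtend test):

```python
import math

# Floating-point results can differ slightly across library versions,
# so tests should compare with a tolerance instead of exact equality.
computed = 0.1 + 0.2   # 0.30000000000000004 in IEEE 754 doubles
assert computed != 0.3  # exact comparison fails
assert math.isclose(computed, 0.3, rel_tol=1e-9)  # tolerant comparison passes
```

`pytest.approx` or `numpy.testing.assert_allclose` do the same job inside test suites.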
From a quick look, there seems to be a more major problem though:
AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
[end of output]
Maybe that's related to Python 3.12. That's definitely something worth fixing so we can find out whether the relevant tests for this PR pass. Doing it in a separate branch first may make sense so the PR doesn't become too cluttered. I also changed the setting here in the hope that tests will now run automatically each time the PR is updated.
And thanks again for your efforts ... I wish I could be more responsive but it's a busy week
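For context on that traceback: pkgutil.ImpImporter was removed in Python 3.12, so build tooling that still references it (typically an old setuptools pulled in during pip's build step, which is my assumption about what the CI is doing here) fails with exactly this AttributeError. A quick way to confirm the version boundary locally:

```python
import sys
import pkgutil

# pkgutil.ImpImporter existed (deprecated) through Python 3.11 and was
# removed in 3.12, so old setuptools releases that reference it crash.
removed = not hasattr(pkgutil, "ImpImporter")
on_312_or_newer = sys.version_info >= (3, 12)

# The two facts line up: the attribute is gone exactly on 3.12+.
print(f"Python {sys.version_info[:2]}, ImpImporter removed: {removed}")
```

Upgrading pip/setuptools in the workflow before installing dependencies is the usual fix.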
I'll do some digging. If I can replicate the issue and come up with a solution I'll submit it in a separate PR.
No worries, it should be fixed now via #1089 (except for the transaction encoder test, but that's something we can address in this PR)
Hm, I think the new failures could be due to the sklearn version bump
I noticed that scikit-learn 1.1.3 is being installed in the GitHub workflow. Can we bump it to 1.2.2, as that is required for set_output to work?
Yes, please feel free to bump it up. We probably need to fix some other places that have not been adjusted for the most recent version though
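For anyone following along, the idea behind set_output (added in scikit-learn 1.2) is that a transformer uses its own get_feature_names_out to label the container it returns. A toy sketch of the pattern with made-up names, not scikit-learn's actual implementation (which returns a pandas.DataFrame for transform="pandas"):

```python
# Hypothetical TinyEncoder illustrating the set_output pattern;
# "dict" stands in for any labeled container.
class TinyEncoder:
    def __init__(self):
        self._output = "default"

    def set_output(self, *, transform="default"):
        # Remember the requested output container and return self,
        # mirroring scikit-learn's chainable API.
        self._output = transform
        return self

    def get_feature_names_out(self):
        # Column labels that set_output uses to label the output.
        return ["apple", "banana"]

    def transform(self, rows):
        matrix = [[int(name in row) for name in self.get_feature_names_out()]
                  for row in rows]
        if self._output == "dict":
            names = self.get_feature_names_out()
            return [dict(zip(names, r)) for r in matrix]
        return matrix


enc = TinyEncoder()
plain = enc.transform([["apple"], ["banana", "apple"]])            # nested lists
labeled = enc.set_output(transform="dict").transform([["apple"]])  # labeled rows
print(plain)    # [[1, 0], [1, 1]]
print(labeled)  # [{'apple': 1, 'banana': 0}]
```

This is why TransactionEncoder needs get_feature_names_out before it can expose set_output.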
Running test_inverse_transform locally, it appears the error is raised when np.array(data_sorted) is called. This is because data_sorted is a list of lists where the nested lists may have varying lengths (same for oht.inverse_transform(expect)). I see two potential ways around this:

1. assert data_sorted == oht.inverse_transform(expect)
2. Passing dtype="object" to the np.array constructors

Both result in the test passing. I'm in favor of the first option as it doesn't require any changes to data_sorted, and the expected output type is a list rather than an np.ndarray.
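To illustrate why np.array chokes here, a minimal sketch with hypothetical ragged data (made-up values, not the actual test fixtures):

```python
import numpy as np

# Ragged nested lists: rows have different lengths, so NumPy cannot
# form a rectangular numeric array from them (recent NumPy raises a
# ValueError; older versions emitted a deprecation warning).
data_sorted = [[0, 1], [0, 1, 2]]

# Option 1: plain list equality works regardless of raggedness.
assert data_sorted == [[0, 1], [0, 1, 2]]

# Option 2: dtype="object" makes NumPy store each row as a Python
# object instead of trying to build a 2-D numeric array.
arr = np.array(data_sorted, dtype="object")
print(arr.shape)  # (2,): a 1-D array holding two list objects
```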
Turns out scikit-learn version 1.2.2 had a bug in the set_output API that failed when the input was not a pandas.DataFrame. See this PR for details. The fix came in scikit-learn version 1.3.1, so bumping the scikit-learn version to 1.3.1 fixes the issue. Commit(s) to follow.
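If it helps, the corresponding pin in a requirements file would look like the following (assuming the project pins versions in a plain requirements file; adjust for setup.py or pyproject.toml as needed):

```
scikit-learn>=1.3.1
```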
Of course bumping the version results in more failed tests 🤦
I'm going to make a separate PR to handle the scikit-learn version bump and the failed unit tests. THEN maybe this will work. Didn't realize it was going to take so much work 😅
Argh, sorry. Yeah, an sklearn bump was overdue, but I haven't had time recently to look into it since I wasn't using the affected features. If this is too much work, don't worry about it; I understand if you want to drop this. I could revisit the version bump in the upcoming weeks, address this, and then merge your PR.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 78.32%. Comparing base (e82c9c5) to head (f018a8d).
:exclamation: Current head f018a8d differs from pull request most recent head 44961b5. Consider uploading reports for the commit 44961b5 to get more accurate results.
:umbrella: View full report in Codecov by Sentry.
@rasbt looks like all checks passed. Ready to merge 😎
Code of Conduct
Description
This defines the get_feature_names_out method in TransactionEncoder to expose the set_output method.

Related issues or pull requests
Fixes #1085
Pull Request Checklist
- Note added to the ./docs/sources/CHANGELOG.md file (if applicable)
- Unit tests added in the ./mlxtend/*/tests directories (if applicable)
- Documentation updated in mlxtend/docs/sources/ (if applicable)
- Ran PYTHONPATH='.' pytest ./mlxtend -sv and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., PYTHONPATH='.' pytest ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
- Checked for style issues by running flake8 ./mlxtend