Take care of the long tail of attributes in init

timokau commented 4 years ago

Description

After removing some big sets (#154, #155), this PR is supposed to take care of the "long tail" of failures of the check_no_attributes_set_in_init sklearn estimator check.

~This is a work in progress. Builds on #154, #155 which should be reviewed first.~

Also see the commit message:

Most of these attributes are actually initialized during fit and the values do not need to be set in init. To conform the the scikit-learn estimator API, we should not set them in init and postfix values that are learned from the data with an underscore.

This takes care of most of the remaining attributes in init. The remaining attributes are due to construct_model being called in __init__, which cannot currently be easily fixed. A workaround for that will follow.

Motivation and Context

Trying to meet the scikit-learn estimator requirements.

How Has This Been Tested?

Work in progress.

Does this close/impact existing issues?

Related to #94, #116.

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[x] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

[x] My code follows the code style of this project.
[ ] My change requires a change to the documentation.
[ ] I have updated the documentation accordingly.
[ ] I have added tests to cover my changes.

timokau commented 4 years ago

Rebased with #155 in master.

timokau commented 4 years ago

This PR is ready for review. There are two unresolved questions that I have commented on inline. After this is done, there are just a handful of "no parameters in init" failures remaining:

E       AssertionError: Estimator <class 'csrank.choicefunction.fate_choice.FATEChoiceFunction'> should not set any attribute apart from parameters during init. Found attributes ['joint_layers', 'kernel_regularizer_', 'optimizer_', 'scorer', 'set_layer'].
E       AssertionError: Estimator <class 'csrank.discretechoice.fate_discrete_choice.FATEDiscreteChoiceFunction'> should not set any attribute apart from parameters during init. Found attributes ['joint_layers', 'kernel_regularizer_', 'optimizer_', 'scorer', 'set_layer'].
E       AssertionError: Estimator <class 'csrank.objectranking.fate_object_ranker.FATEObjectRanker'> should not set any attribute apart from parameters during init. Found attributes ['joint_layers', 'kernel_regularizer_', 'optimizer_', 'scorer', 'set_layer'].

All of these have the same cause: The FATE core is calling _construct_layers in __init__. It is not trivial to work around that right now, since the core's construct_layers has to be called beforefitand there is no easy way to do that except copying the call into all thefit` functions.

I plan to fix that by renaming all our fit functions to _fit and then create a wrapper fit in learner.py that first calls something like _fit_init and then _fit. That way we would be able to inherit pre-fit stateful initialization. The wrapper should pass on the documentation. I will do that in a separate PR, since its largely unrelated to the changes here and I think it is easier to review them separately.

timokau commented 4 years ago

Rebased to test the new CI. Still ready for review / feedback.

timokau commented 4 years ago

The two threads are now resolved. I only changed the way the is_variadic parameter (which is no longer a parameter) is handled. Please have another look, I think this is good to go now.

kiudee / cs-ranking