acbull / Unbiased_LambdaMart

Code for WWW'19 "Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm", which is based on LightGBM
MIT License
225 stars 50 forks source link

Reccommend using a submodule+fork for Unbias_LightGBM #8

Open paddy74 opened 5 years ago

paddy74 commented 5 years ago

Due to Unbias_LightGBM effectively being a fork of LightGBM, it would be sensible to create a fork of LightGBM with the necessary changes and renamed to Unbais_LightGBM and add that fork as a submodule to this project. The new fork would initially be set to the relevant commit of LightGBM that Unbias_LightGBM is based upon.

This would allow updates and bug fixes to LightGBM to be easily incorporated into this project, as well as additional clarity that Unbias_LightGBM is LightGBM with modification.

acbull commented 5 years ago

That's a reasonable suggestion. So how to modify the current repository to be a fork of LightGBM?

paddy74 commented 5 years ago

The current repository would not be a fork. It would contain a submodule (link to another git repository) to the fork.

Before doing any of the following I recommend creating a release. That way your paper is correlated with a specific release (i.e. "This paper is based on v1.0").

Do you recall what LightGBM commit you started at? If so:

Then anytime you clone the current repository you would run git submodule init && git submodule update.

If desired you can then attempt to merge in and test with the latest LightGBM changes. As submodules are created at a commit, you may then update the submodule to the latest commit in the fork with git submodule update --remote --merge and commit and push that.

If you do not recall the exact commit Try and get as close as possible (prefer a later commit) then test to make sure it doesn't affect your results.

upbit commented 5 years ago

I tried to merge Unbiased_LambdaMart into the original and it is now ready to compile: https://github.com/upbit/Unbiased_LambdaMart/commit/0071cf2e5b6a5dd9bd64121215d82ebebe588a36

The submodule method is not easy to track the new version.

paddy74 commented 5 years ago

What is it you mean by:

The submodule method is not easy to track the new version.

?

The main project would reference a specific commit of the submodule (preferably a 'release' commit).

upbit commented 5 years ago

Sorry for my poor english. I mean

It is not easy to upgrade to the latest version of LightGBM (eg: v2.2.4).

If forked from microsoft/LightGBM, we can merge upstream changes at any time, just by initiating an merge request microsoft/LightGBM:master -> acbull:master

paddy74 commented 5 years ago

Yes that would be the idea. acbull:Unbiased_LightGBM would be a fork of microsoft/LightGBM:release (where release is some release commit).

Then acbull:Unbiased_LightGBM:release would be added as a submodule to this project, acbull:Unbiased_LambdaMART.

Even if the decision is not to use the latest LightGBM, it still provides it as a possibility and links the fork acbull:Unbiased_LightGBM to a specific version of LightGBM. This also makes it significantly easier to distinguish what alterations were made to make it "unbiased".

robert-howley-zocdoc commented 4 years ago

An alternative that would greatly increase the impact of this methodology is to add this to lightgbm itself. Fork lightgbm, add your implementation, and open a pr to merge it back into the source repo.

This seems to be a potentially powerful approach to LTR, I suspect the lightgbm maintainers would be appreciative of the contribution.