khalil-research / PyEPO

A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming
MIT License
429 stars 56 forks source link

Implemented noise contrastive estimation and learning to rank losses #9

Closed senneberden closed 1 year ago

senneberden commented 1 year ago

This contribution adds the noise contrastive estimation and the learning to rank losses for decision-focused learning to PyEPO. Both of these approaches make use of the solution caching scheme that is already implemented in PyEPO.

Concretely, the following changes are included:

LucasBoTang commented 1 year ago

Hi ijskar,

Thank you for your great contribution to the PyEPO project. Your additions of noise contrastive estimation and learning to rank losses are a solid enhancement to our project.

I see you've integrated it well with our existing features, and your explanation is clear. I am enthusiastic about reviewing your pull request. I will do so as promptly as possible and provide feedback or suggestions if necessary.

Thanks again for your time and efforts!

LucasBoTang commented 1 year ago

It was merged on #10 to NEC branch for further review. Thank you @ijskar !

LucasBoTang commented 1 year ago

Hi @ijskar, I have a question about the use of "np.unique" in your code for PairwiseLTR (line 137). I noticed that it's used to sort and remove duplicates. I'm curious about the significance of this action and if we could possibly find the argmin and make comparisons with the rest (with duplication). Could you please shed light on this?

senneberden commented 1 year ago

Hi @LucasBoTang,

Indeed, the "np.unique" removes duplicates and sorts the solution pool. We could indeed also just find the argmin and compare with the rest. That might be more consistent w.r.t. the other learning to rank methods, since duplicates aren't removed in those methods.

However, I think that it might be even better to simply not allow duplicates in the solution pool at all. Not only will this limit the size of the pool, but it also makes more sense in the context of the ranking methods. Duplicate solutions are currently ranked over multiple times, and thus have a larger effect on the ranking losses than unique solutions, but this is not intended. Perhaps whenever a solution is to be added to the pool, we can first check whether it is already present in the pool, and only add it if it isn't. What do you think about this?

LucasBoTang commented 1 year ago

Hi @ijskar ,

Thank you for explaining the use of "np.unique". Indeed, I concur with your viewpoint on keeping the solution pool unique. It makes much sense.

In fact, I had the same ideas before, and I modified the solution pool to only keep unique solutions yesterday (see the commit "New feat: unique solpool"). This adjustment was incorporated into our code base yesterday, which also seems slightly improve the performance.

Also, I have completed the review of your pull request. Therefore, I have now merged your contribution into the main branch of the PyEPO repository. Given that there have been changes in the code, you can also review the modified code when you get a chance.