Documentation insufficient for initializing `WeakSupRank`

SprocketLab / WS-Structured-Prediction


Documentation insufficient for initializing `WeakSupRank` #1

Open scottfleming opened 1 year ago

scottfleming commented 1 year ago

There is no docstring for the constructor of `WeakSupRank`, and the paper itself does not give enough information to know what arguments it expects without digging through the codebase. It would be helpful to have a docstring describing how to initialize `r_utils` for various tasks (e.g., ranking, regression), with some examples, so that users can use `WeakSupRank` out of the box.

https://github.com/SprocketLab/WS-Structured-Prediction/blob/39d7da86bf043e88e9eeb62a4fa6fb926a55efa7/code/core/ws_ranking.py#L13C31-L13C31

scottfleming commented 1 year ago

If I understand correctly, `RankingUtils` takes a single argument `d`, the cardinality of the ranking, but this is inferred anyway in the `train` function here: https://github.com/SprocketLab/WS-Structured-Prediction/blob/39d7da86bf043e88e9eeb62a4fa6fb926a55efa7/code/core/ws_ranking.py#L34C11-L34C25. So why not initialize the `WeakSupRank` object with `L`, infer `d`, and then create the `RankingUtils` as needed within the constructor?
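A minimal sketch of that suggestion, under stated assumptions: `L` is a list of lists whose innermost entries are plain sequences of item indices (not the repository's `Ranking` objects), and `infer_cardinality` and `WeakSupRankSketch` are hypothetical names used only for illustration, not part of the codebase.

```python
from typing import Sequence


def infer_cardinality(L: Sequence[Sequence[Sequence[int]]]) -> int:
    """Return d, the number of items in each ranking.

    Raises ValueError if L is empty or the rankings disagree on length.
    """
    lengths = {len(ranking) for row in L for ranking in row}
    if len(lengths) != 1:
        raise ValueError(f"expected exactly one ranking length, found {sorted(lengths)}")
    return lengths.pop()


class WeakSupRankSketch:
    """Hypothetical stand-in, NOT the repository's WeakSupRank."""

    def __init__(self, L):
        self.L = L
        # Infer d up front; the real class could construct its
        # RankingUtils from this value instead of asking the caller for it.
        self.d = infer_cardinality(L)


L = [
    [[0, 1, 2], [2, 1, 0]],
    [[1, 0, 2], [0, 2, 1]],
]
wsr = WeakSupRankSketch(L)
print(wsr.d)
```

With this shape of constructor the caller never has to know about `d` or `RankingUtils` at all.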

scottfleming commented 1 year ago

What I want is to be able to pass `L` into a `WeakSupRank` constructor (call the resulting object `wsr`), then run `wsr.train` and `wsr.infer_ranking` to retrieve the weakly supervised labels. `L` also needs a better description in `WeakSupRank`, because the current docstring doesn't say what the list of lists should contain (presumably `Ranking` objects?).

scottfleming commented 1 year ago

On this note, the ideal would be to pass in `L` as just a list of lists or a NumPy array, without the user having to worry about the `Ranking` object abstraction. That is, define `L` as an (n, m, d) NumPy array (or better yet as an (m, n, d) array, because it is easier to pull out an array of rankings on a per-LF basis than the other way around) and then handle all the `Ranking` conversion internally if the underlying abstraction is really needed.
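To make the array layout concrete, here is a rough sketch. It assumes m is the number of labeling functions, n the number of examples, and d the ranking cardinality, and that each ranking is encoded as a permutation of 0..d-1. That encoding is my assumption, not something the repository documents. `validate_rankings` and `per_example_view` are hypothetical helpers.

```python
import numpy as np


def validate_rankings(L) -> np.ndarray:
    """Check that L has shape (m, n, d) and each L[j, i] is a permutation of 0..d-1."""
    L = np.asarray(L)
    if L.ndim != 3:
        raise ValueError(f"expected a 3-D array (m, n, d), got shape {L.shape}")
    d = L.shape[2]
    # Sorting a permutation of 0..d-1 along the last axis must give 0, 1, ..., d-1.
    expected = np.broadcast_to(np.arange(d), L.shape)
    if not np.array_equal(np.sort(L, axis=2), expected):
        raise ValueError("each ranking must be a permutation of 0..d-1")
    return L


def per_example_view(L) -> np.ndarray:
    """Reorder an (m, n, d) array to (n, m, d), grouping rankings by example."""
    return np.transpose(validate_rankings(L), (1, 0, 2))


# 3 labeling functions, 2 examples, rankings over 3 items.
L = np.array([
    [[0, 1, 2], [2, 1, 0]],
    [[1, 0, 2], [0, 2, 1]],
    [[2, 0, 1], [1, 2, 0]],
])
lf_rankings = L[1]               # all rankings from LF 1, shape (n, d)
by_example = per_example_view(L) # shape (n, m, d)
```

In the (m, n, d) layout, one LF's rankings are a single index `L[j]`, and a transpose recovers the per-example grouping whenever the internals need it.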