Open KyleKaminky opened 3 months ago
Wow, just got around to reviewing the change. Complete change of submission format.
The file should contain a header and have the following format:
RowId,Tournament,Bracket,Slot,Team
1,M,1,R1W1,W01
2,M,1,R1W8,W08
3,M,1,R1W5,W05
...

Here, the RowId column is a dummy index required by the metric; it should be a simple enumeration of the rows. The Tournament column indicates either the Men's (M) tournament or the Women's (W) tournament. The Bracket column enumerates the brackets in each tournament, starting from 1; you should use a unique enumeration for each tournament. The Team column should contain the team you predict to win in that respective Slot.
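For reference, here is a minimal sketch of reading a file in this format and sanity-checking it, using only the standard library. The function name `load_submission` and the specific checks are my own assumptions for illustration, not part of the Kaggle spec or the existing code.

```python
import csv
import io

# Header as quoted in the competition description above.
EXPECTED_HEADER = ["RowId", "Tournament", "Bracket", "Slot", "Team"]


def load_submission(text):
    """Parse submission CSV text into a list of dict rows (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = list(reader)
    for row in rows:
        # Tournament is either Men's (M) or Women's (W).
        if row["Tournament"] not in ("M", "W"):
            raise ValueError(f"bad Tournament in row {row['RowId']}")
        # Brackets are enumerated starting from 1.
        if int(row["Bracket"]) < 1:
            raise ValueError(f"bad Bracket in row {row['RowId']}")
    return rows


sample = """RowId,Tournament,Bracket,Slot,Team
1,M,1,R1W1,W01
2,M,1,R1W8,W08
3,M,1,R1W5,W05
"""

rows = load_submission(sample)
print(rows[0]["Slot"], rows[0]["Team"])
```

The same checks would carry over to a pandas DataFrame if that is what the package ends up loading.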
So the updated design should:
New unit testing may require some new fake data (Kaggle owns this dataset).
Need to update dependencies and build process. Way out of date.
Pseudo code
Two externally accessible functions:
- Build bracket
- Build brackets

Build brackets just calls build bracket multiple times depending on args, passing only relevant data to build bracket.
- Load submission file
- Filter to only requested rows, iterate if multiple submissions.
- Output dir param

Build bracket
- Output file param
- Bracket choices come from submission df (param)
- Map from slot data to team name for image (test with slot data in image first?)
- Remove probability comparison
- Determine winner based on slot data instead
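A rough Python sketch of the pseudo code above, to make the split between the two functions concrete. All names and signatures here are placeholders, not the final API. To stay dependency-free it uses a list of dict rows in place of the submission df, takes already-loaded rows instead of loading the file, and writes a plain text file where the real function would draw the bracket image. The team name mapping is made-up example data.

```python
import os


def build_bracket(rows, team_names, output_file=None):
    """Build one bracket: map each slot to the predicted winner's name.

    The winner comes straight from the Team column for that slot,
    so no probability comparison is needed.
    """
    picks = {r["Slot"]: team_names.get(r["Team"], r["Team"]) for r in rows}
    if output_file is not None:
        # Stand-in for rendering the image: write one line per slot.
        with open(output_file, "w") as fh:
            for slot, name in picks.items():
                fh.write(f"{slot}: {name}\n")
    return picks


def build_brackets(rows, team_names, output_dir=None):
    """Call build_bracket once per (Tournament, Bracket) pair in the rows."""
    results = {}
    keys = sorted({(r["Tournament"], int(r["Bracket"])) for r in rows})
    for tournament, bracket in keys:
        # Pass only the rows relevant to this bracket.
        subset = [
            r for r in rows
            if r["Tournament"] == tournament and int(r["Bracket"]) == bracket
        ]
        out = None
        if output_dir is not None:
            out = os.path.join(output_dir, f"{tournament}_{bracket}.txt")
        results[(tournament, bracket)] = build_bracket(subset, team_names, out)
    return results


example_rows = [
    {"RowId": "1", "Tournament": "M", "Bracket": "1", "Slot": "R1W1", "Team": "W01"},
    {"RowId": "2", "Tournament": "M", "Bracket": "1", "Slot": "R1W8", "Team": "W08"},
    {"RowId": "3", "Tournament": "W", "Bracket": "1", "Slot": "R1W1", "Team": "W01"},
]
# Placeholder mapping from the Team value to a display name.
example_names = {"W01": "Team A"}
brackets = build_brackets(example_rows, example_names)
```

Falling back to the raw Team value when no name is found matches the "test with slot data in image first?" idea: the bracket can be rendered before the name mapping is finished.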
The Kaggle competition has changed its submission format. I'll try to start working on a PR for this.