Closed gheemony closed 3 years ago
Responses to the main question are below. Feel free to follow up with anything or close the issue; otherwise I will close this tomorrow.
Note that the 538 and kenpom distributions are not imported into the package for 2021 at this point. I'd be open to a PR that grabs the 538 data (https://projects.fivethirtyeight.com/2021-march-madness-predictions/), formats it into the 64x7 layout, and maps the teams in teams.men, but I don't currently plan to do this myself. I remember that transformation being somewhat of a hassle last time I did it. I don't see kenpom probs out at this point. The population picks are in the package for 2021 now, though, with the completion of #28.
sim.bracket accepts either a prob.matrix argument, as in the vignette, or a prob.source = c("pop", "kenpom", "538") argument, which accepts that 64x7 format by pulling from the package's data folder. Those route to sim.bracket.matrix and sim.bracket.source respectively to handle the different data structures.
draw.bracket is independent of this; it accepts bracket.empty (the seeds, now available in the package) and bracket.filled no matter how the filled-out bracket is generated.
test.bracket also accepts both prob.matrix and prob.source = c("pop", "kenpom", "538"), which are handled in similar ways. It also accepts pool.source, which determines how to sample the brackets picked by others in the pool.
I've taken the current 538 data and put it into the format for previous prediction files. See attached.
The file/object is in my Global Environment. But when I run find.bracket, I get an error: "Error: 'pred.538.men.2021' is not an exported object from 'namespace:mRchmadness'". I can hunt around StackExchange and other sources for workarounds but thought you might be able to find an easier way to fix this problem. One option would be altering the code so that the Global Environment is checked for data, since it appears that updated data won't be provided every year. Thanks! pred.538.men.2021.zip
Good work! I will re-open and convert this over to a data update issue, and try to tackle it by the end of the day. The package is looking for pred.538.men.2021 within the package's namespace. You could have it look in the global environment by removing the mRchmadness:: here:
That said, this will only work if the teams are mapped in teams.men$name.538 to associate the team names 538 uses with the ESPN IDs that drive the rest of the analysis, which takes a bit of effort. I will do this by the end of the day if you have not.
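The namespace lookup described above can be reproduced in base R. This is a minimal sketch, not the package's code: the stats namespace stands in for mRchmadness, and my.preds is a hypothetical object name with toy contents.

```r
# Hypothetical object that exists only in the global environment (toy data).
assign("my.preds",
       data.frame(name = "Team A", round1 = 0.9),
       envir = globalenv())

# A namespace-qualified lookup (what pkg::name does) never consults the
# global environment, so it fails even though my.preds exists there.
qualified <- tryCatch(
  getExportedValue("stats", "my.preds"),
  error = function(e) conditionMessage(e)
)
print(qualified)

# Looking in the global environment directly finds the object.
unqualified <- get("my.preds", envir = globalenv())
print(unqualified$name)
```

The first lookup fails in the same way as the reported error; only the package name differs.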
Updated teams.men file with 538 team names appended. teams.men.update.zip
I'm unable to fix the issue by making the one change to sim.bracket.source.R. I can't find the file in the namespace or in the package directory. I downloaded the raw file from GitHub, made the change, and ran it so that the modified version is in the Global Environment, but it doesn't prevent the error. I'm hoping you can make the fix relatively soon so that brackets can be prepared tonight. Thanks.
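One likely reason the modified copy had no effect, assuming find.bracket reaches sim.bracket.source through the package namespace (the error message suggests it does): functions inside a package resolve their helpers in that package's namespace first, so a same-named function in the global environment is never consulted. A base-R sketch of the same effect, using stats::sd (which calls var internally) as a stand-in:

```r
# Define a global function that shadows stats::var for top-level code only.
assign("var", function(x, ...) 0, envir = globalenv())

# Calling the global copy directly returns the shadowed result.
top.level <- get("var", envir = globalenv())(c(1, 2, 3))

# sd() lives in the stats namespace, so it still finds stats::var.
inside.pkg <- stats::sd(c(1, 2, 3))

print(c(top.level = top.level, inside.pkg = inside.pkg))

# Clean up the shadowing definition.
rm("var", envir = globalenv())
```

Under that assumption, changing the package source and reinstalling is the more reliable route than sourcing a modified copy of one file.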
I assume this won't get done tonight?
It should be done within 20 min or so.
Mention it here if there are any issues after re-installing the package. I didn't test it much, but did make sure the team names are all mapped so I think it should work.
[1] "2378" "2306" "12" "2084" "96" "2166" "2132" "2247" "2440" "2750" "2390" "2752" "2640" "2571" "288" "245" "2507" "149"
[19] "2515" "276" "93" "2239" "2086" "2617" "219" "2550" "152" "2" "232" "166" "150" "227" "2083" "2628"
Error in sim.bracket.source(prob.source = prob.source, league = league, : No predictions from source for above teams. Is year correct?
Checking now to see why teams are missing.
The teams.men object is incorrect. It is missing IDs that are in the bracket.men.2021. I matched names from 538 to teams.men, so that is why the update is still incorrect. I will look into matching to bracket.men.2021.
I used the 538 CSV you sent over but ended up doing the mappings myself in the way I normally do because it was easier to apply for me.
I don't follow that second part though; all the teams in bracket.men.2021 map to teams in teams.men:
> mRchmadness::bracket.men.2021 %in% mRchmadness::teams.men$id
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[17] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[33] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
[49] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE FALSE
> mRchmadness::bracket.men.2021[!mRchmadness::bracket.men.2021 %in% mRchmadness::teams.men$id]
[1] "2724/2181" "127/26" "116/2640" "2450/2026"
> c(2724, 2181, 127, 26, 116, 2640, 2450, 2026) %in% mRchmadness::teams.men$id
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
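The manual check above can be generalized by splitting the combined play-in entries on "/" before matching. A minimal sketch with toy vectors: bracket.ids and known.ids are stand-ins rather than the package objects, and the ID 9999 is deliberately made up to show a miss.

```r
# Toy stand-ins: two play-in slots from the output above plus a made-up ID.
bracket.ids <- c("2724/2181", "127/26", "9999")
known.ids <- c(2724, 2181, 127, 26)

# Split "a/b" play-in slots into their component team IDs.
component.ids <- unlist(strsplit(bracket.ids, "/", fixed = TRUE))

# Any ID listed here is genuinely missing from the lookup table.
unmapped <- setdiff(component.ids, as.character(known.ids))
print(unmapped)
```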
Ok, then why do I get the error that there are no predictions from the source?
Looking through some of the IDs that are printed out in the error, there appear to be teams that are not in the 2021 bracket. UMBC is the first team ID, for example, which suggests somewhere along the way you are using 2018 data (perhaps from a default arg), as the error message suggests. Did you specify year = 2021 in the call to sim.bracket?
Also, worth mentioning there's a working version (not supporting 538 probs, using the population pick probs for the distribution of pool picks and a Bradley-Terry model for the game probs) for 2021 at https://saberpowers.shinyapps.io/mRchmadness/
I had an incorrect reference to 2018. But now I get an error for the references to the first four games:
[1] "2450/2026" "2724/2181" "116/2640" "127/26"
Error in sim.bracket.source(prob.source = prob.source, league = league, : No predictions from source for above teams. Is year correct?
I appreciate you hanging in there with me. I think this is the last thing to sort out.
I did notice the CSV you sent for the 538 probs only had 64 rows instead of the 68 I expected. Did the first four teams in there represent the prob for both of them? Assuming so, I think I know what to... give me a few min
Yes, I combined their probabilities because I thought it was necessary. Sorry. The options now are to take the latest 538 probs with just the 4 winners or the old probs with all 68 teams.
There's a chance it works with the commit above but no promises. This is what I did to generate the change in data, for reference.
teams.men$name.538 = as.character(teams.men$name.538)
teams.men[teams.men$id == 2026, 'name.538'] = 'Appalachian St'
teams.men[teams.men$id == 2181, 'name.538'] = 'Drake'
teams.men[teams.men$id == 2450, 'name.538'] = 'Norfolk State'
teams.men$name.538 = as.factor(teams.men$name.538)
save(teams.men, file='.../mRchmadness/data/teams.men.RData')
pred.538.men.2021$name = as.character(pred.538.men.2021$name)
pred.538.men.2021[pred.538.men.2021$name == 'Wichita State', 'name'] = 'Wichita State/Drake'
pred.538.men.2021[pred.538.men.2021$name == 'Michigan State', 'name'] = 'Michigan State/UCLA'
pred.538.men.2021[pred.538.men.2021$name == "Mount St. Mary's", 'name'] = "Mt St Mary's/Texas Southern"
pred.538.men.2021[pred.538.men.2021$name == 'Appalachian State', 'name'] = 'Norfolk State/Appalachian St'
pred.538.men.2021$name = as.factor(pred.538.men.2021$name)
save(pred.538.men.2021, file='.../mRchmadness/data/pred.538.men.2021.RData')
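The as.character / as.factor round trip in the snippet above matters because assigning a value that is not an existing level to a factor yields NA with a warning. A minimal sketch on a toy data frame (the rows and placeholder names are illustrative, not the real teams.men contents):

```r
# Toy data frame with a factor column, mimicking the name.538 column.
teams <- data.frame(
  id = c(2181, 2450),
  name.538 = factor(c("placeholder A", "placeholder B"))
)

# Direct assignment of a value that is not a level: the cell becomes NA.
direct <- teams
suppressWarnings(direct[direct$id == 2181, "name.538"] <- "Drake")
print(is.na(direct$name.538[1]))

# Converting to character first lets the new value through.
fixed <- teams
fixed$name.538 <- as.character(fixed$name.538)
fixed[fixed$id == 2181, "name.538"] <- "Drake"
fixed$name.538 <- as.factor(fixed$name.538)
print(as.character(fixed$name.538))
```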
All but one worked: "116/2640". Could it be that you're just missing the removal for Michigan State?
I pulled the wrong name for Mt. Saint Mary's in the snippet above. I have a good feeling about it this time.
pred.538.men.2021$name = as.character(pred.538.men.2021$name)
pred.538.men.2021[pred.538.men.2021$name == "Mt St Mary's/Texas Southern", 'name'] = "Mount St. Mary's/Texas Southern"
pred.538.men.2021$name = as.factor(pred.538.men.2021$name)
save(pred.538.men.2021, file='.../mRchmadness/data/pred.538.men.2021.RData')
Thank you so much. To show my appreciation, I'd love to help with the project going forward, which we can discuss later. I understand if your answer is "thanks, but no thanks." Good luck with your picks.
I apologize if this is addressed in the documentation, but I have not been able to find an answer to this in the Vignette or elsewhere: How do you simulate a tournament and find and test brackets using 538 or other predictions?
The Vignette uses the Bradley-Terry example, which generates a prob.matrix with one row for each team and one column for each team (64x64).
However, the past 538 predictions that are provided are 64x7: a row for each team and a column for win probability in each round. Is there a function to convert this to a prob.matrix that can be used with sim.bracket, draw.bracket, and test.bracket?
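To make the two shapes in the question concrete: a head-to-head matrix determines the per-round advancement probabilities, while the per-round table generally does not pin down a unique head-to-head matrix. A minimal sketch on a hypothetical 4-team bracket (A vs B, C vs D) with made-up Bradley-Terry strengths; none of these names come from the package.

```r
# Made-up team strengths for a toy 4-team bracket: A vs B, C vs D.
strength <- c(A = 4, B = 1, C = 2, D = 1)

# Head-to-head format: p[i, j] = P(team i beats team j), here 4x4.
p <- outer(strength, strength, function(a, b) a / (a + b))

# Round 1: each team plays its fixed first-round opponent.
opponent <- c(2, 1, 4, 3)
round1 <- p[cbind(1:4, opponent)]

# Round 2: win round 1, then beat whoever emerges from the other half.
round2 <- sapply(1:4, function(i) {
  other.half <- if (i <= 2) 3:4 else 1:2
  round1[i] * sum(round1[other.half] * p[i, other.half])
})

# Per-round format: one row per team, one column per round (4x2 here).
by.round <- cbind(round1, round2)
rownames(by.round) <- names(strength)
print(round(by.round, 3))
```

Each column of by.round sums to the number of teams that win in that round (2, then 1), which is a quick sanity check on a table in the per-round shape.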