Closed danabrey closed 4 years ago
Thanks! It's currently constructed by doing a few joins - MFL and Sleeper share a fantasydata ID and stats_global_ID, so I merge those two IDs sequentially, and have toyed with name-merges for others. This improves as MFL and Sleeper prune their IDs. I could maintain a supplemental ID csv/database that's joined in after those joins 🤔
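For anyone following along, here is a rough sketch of what "merge those two IDs sequentially" could look like. It is written in Python for illustration and is not the actual script; the key names `fantasydata_id` and `stats_global_id` follow the comment above, while the `sequential_join` helper, the row layout, and the sample values are all made up.

```python
def sequential_join(mfl_rows, sleeper_rows,
                    keys=("fantasydata_id", "stats_global_id")):
    """Attach a sleeper_id to each MFL row, trying each shared key in order."""
    merged = [dict(row, sleeper_id=None) for row in mfl_rows]
    for key in keys:
        # Index Sleeper players by this key, skipping blank values.
        lookup = {s[key]: s["sleeper_id"] for s in sleeper_rows if s.get(key)}
        for row in merged:
            # Only fill rows that the earlier keys failed to match.
            if row["sleeper_id"] is None and row.get(key) in lookup:
                row["sleeper_id"] = lookup[row[key]]
    return merged


# Hypothetical sample data, purely for illustration.
mfl = [
    {"mfl_id": "1", "fantasydata_id": "100", "stats_global_id": "900"},
    {"mfl_id": "2", "fantasydata_id": "", "stats_global_id": "901"},
    {"mfl_id": "3", "fantasydata_id": "", "stats_global_id": ""},
]
sleeper = [
    {"sleeper_id": "A", "fantasydata_id": "100", "stats_global_id": ""},
    {"sleeper_id": "B", "fantasydata_id": "", "stats_global_id": "901"},
]
result = sequential_join(mfl, sleeper)
```

Players that match on neither key (the third row here) keep an empty `sleeper_id`, which is the gap a supplemental file would cover.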
Twitter DMs is good by me too, although I can better pretend I'm working when I have GitHub open ;)
I think from a "contributing" standpoint, I could make it easier by maintaining a csv in this repo, and then accepting PRs to update it. I can download the csv onto the server as part of the script and then do the supplemental join to fill in the missing IDs. Not sure that'll fix Herndon, but that may be a separate issue. DeAndre Hopkins of all people had a similar issue last year (sigh)
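A minimal sketch of that supplemental join, assuming the CSV is keyed on `mfl_id` and should only fill blanks, never overwrite IDs the main joins already found. This is illustrative Python, not the real script; `fill_missing` is a hypothetical helper, and the download step is left out (the CSV text is passed in as a string).

```python
import csv
import io


def fill_missing(players, supplemental_csv_text):
    """Fill blank ID fields from a supplemental CSV keyed on mfl_id."""
    reader = csv.DictReader(io.StringIO(supplemental_csv_text))
    extra = {row["mfl_id"]: row for row in reader}
    filled = []
    for player in players:
        patch = extra.get(player["mfl_id"], {})
        updated = dict(player)
        for col, value in patch.items():
            # Never overwrite an ID the main joins already found.
            if value and not updated.get(col):
                updated[col] = value
        filled.append(updated)
    return filled


# ID pairs below are ones reported in this issue; the player rows are a toy stand-in.
supplemental = "mfl_id,sleeper_id\n14787,6898\n15034,6989\n"
players = [
    {"mfl_id": "14787", "sleeper_id": ""},
    {"mfl_id": "15034", "sleeper_id": None},
    {"mfl_id": "14072", "sleeper_id": "6156"},
]
patched = fill_missing(players, supplemental)
```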
Chatting here it is, then. I've given up pretending to be working since moving to full remote working due to COVID, haha.
Having a CSV on here that I can make PRs to sounds like a good solution. For my app's purposes, I'll need to maintain a list of missing IDs somewhere anyway, so if you let me know what columns that CSV will have, etc, then I can start putting that together.
All the same cols and nomenclature as db_playerids is probably fine for now (just can't be missing mfl_id). I'll need to figure out "new_gsis_ids" at some point because the nflscrapR/nflfastR API changed, but I may be joining those in as a separate csv/table.
MFL ids are serving as my primary key right now (would consider changing that laterish but I find it's the best/most-complete/consistent API so am a little anchored), so the only restriction on the csv is that mfl_id must be there and unique (any other fields can be missing).
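The one rule stated above (mfl_id present and unique, everything else optional) is easy to check before opening a PR. A small sketch, assuming a header row that includes an `mfl_id` column; `validate_supplemental` is a hypothetical helper, not part of the repo.

```python
import csv
import io
from collections import Counter


def validate_supplemental(csv_text):
    """Return a list of problems; an empty list means the file passes."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ids = [(row.get("mfl_id") or "").strip() for row in rows]
    problems = []
    for line_no, mfl_id in enumerate(ids, start=2):  # line 1 is the header
        if not mfl_id:
            problems.append(f"line {line_no}: missing mfl_id")
    for mfl_id, count in Counter(i for i in ids if i).items():
        if count > 1:
            problems.append(f"mfl_id {mfl_id} appears {count} times")
    return problems


# A blank sleeper_id is allowed; a blank or repeated mfl_id is not.
good = "mfl_id,sleeper_id\n14787,6898\n15034,\n"
bad = "mfl_id,sleeper_id\n14787,6898\n14787,6898\n,6989\n"
good_problems = validate_supplemental(good)
bad_problems = validate_supplemental(bad)
```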
Thanks for being interested in this, really appreciate it :D I'm masked and working from the office rn haha
Awesome. I think for cleanliness for now, a simple two-column CSV, mfl_id and sleeper_id, would suffice? I can put that in a fork of this repo and submit a PR, any care for the name? missing_sleeper_ids.csv?
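For reference, a sketch of how that two-column file could be generated rather than typed by hand. The column names are the ones proposed above and the three ID pairs are taken from this issue; writing to an in-memory buffer instead of a file is just to keep the example self-contained.

```python
import csv
import io

# (mfl_id, sleeper_id) pairs reported in this issue.
reported = [
    ("14787", "6898"),
    ("15034", "6989"),
    ("14816", "6992"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["mfl_id", "sleeper_id"])  # header row
writer.writerows(reported)
csv_text = buffer.getvalue()
```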
And my interest is almost entirely self-serving! This data resource is so awesome as a base for so many of my side-projects, I have probably 10 different crazy spreadsheets all using combinations of values, ID merging, etc. and now this larger web app project.
Also, agree that MFL provides the most consistent player list. My data-mining always goes Step 1: import players from MFL. Step 2: merge everything else in. Any player that's missing from MFL's database isn't a real player :)
I've noticed a bunch of PFR mismatches too (that's joined with a name/team/pos merge) so if you just call it "missing_playerids.csv" and leave me a blank col there that'd be helpful :)
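Name/team/pos merges tend to break on punctuation and suffixes, which may explain some of those mismatches. A rough sketch of a normalised merge key, assuming that dropping punctuation, case, and generational suffixes is acceptable; `merge_key`, the suffix list, and the team/position codes in the example are illustrative and not what the actual script does.

```python
import re

# Generational suffixes to ignore when comparing names (assumed list).
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "v"}


def merge_key(name, team, pos):
    """Build a normalised (name, team, pos) key for exact-match joins."""
    # Lowercase and drop everything except letters and spaces.
    cleaned = re.sub(r"[^a-z ]", "", name.lower())
    parts = [p for p in cleaned.split() if p not in SUFFIXES]
    return ("".join(parts), team.upper(), pos.upper())


# Two spellings of the same player collapse to one key.
key_a = merge_key("Patrick Taylor Jr.", "gb", "rb")
key_b = merge_key("Patrick Taylor", "GB", "RB")
```

Anything that still fails to match after normalising is a candidate for the blank column in the supplemental csv.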
@tanho63 - have you considered creating your own internal id for the db_playerids.csv to use as the primary id for this table? In the baseball world, I've used this project https://github.com/chadwickbureau/register to cross-reference player ids at different websites. They created a new id for each player that is guaranteed to be consistent and unique, and it doesn't rely on 3rd-party ids. Probably overkill for this specific issue.
@trojanguard25
is the main reason why I haven't, to be honest. I'd consider it more if I thought there wasn't at least one definitively good/maintained one like MFL or fantasydata or whatever
I'll close off this issue when I get the merge script sorted and it works for the first time :)
I opened a fresh PR with every missing Sleeper ID bar one - #12
Decided to run it early while I was still looking at it, looks good to me :)
Thanks again for reaching out and contributing, much appreciated!
That's awesome news, thanks!
I'm not sure how you collate the player IDs, but I've got some contributions to make if that's possible:
Chris Herndon seems to have duplicate IDs in Sleeper's database: 5009 and 5755 - 5755 is a free agent, 5009 looks to be the real Chris Herndon
Anthony Gordon (MFL ID 14787) missing Sleeper ID: 6898
Marquez Callaway (MFL ID 15034) missing Sleeper ID: 6989
Mike Warren (MFL ID 14816) missing Sleeper ID: 6992
Benny Snell (MFL ID 14072) missing Sleeper ID: 6156
Quartney Davis (MFL ID 14856) missing Sleeper ID: 6879
Salvon Ahmed (MFL ID 14811) missing Sleeper ID: 6918
Jeff Thomas (MFL ID 14866) missing Sleeper ID: 7076
JaMycal Hasty (MFL ID 14821) missing Sleeper ID: 6996
Patrick Taylor (MFL ID 14817) missing Sleeper ID: 6963
Thaddeus Moss (MFL ID 14869) missing Sleeper ID: 6919
I have an app that's using the awesome .csv to help analyse some rosters. Depending on how you create the csv, maybe I could contribute these in a more automated way - or maybe not! Have a chat on Twitter DMs if you want?