dynastyprocess / data

An open-data fantasy football repository, maintained by DynastyProcess.com
https://dynastyprocess.com
GNU General Public License v3.0

[Data Bug] #21

Closed damj97 closed 3 years ago

damj97 commented 3 years ago

I found a duplicate gsis_id

Both MFL IDs 11724 and 14683 have a gsis_id of 00-0034641.

I think gsis_id of 00-0034641 goes with MFL ID 14683 (Chris Jones)
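A check along these lines could surface this kind of collision automatically. It is only a sketch: the two rows are the records reported in this issue, and the field names `mfl_id` and `name`, as well as the helper `find_duplicate_ids`, are assumptions made for illustration rather than anything taken from the repository's code.

```python
from collections import defaultdict

# Illustrative rows only: the two records reported in this issue.
# The field names "mfl_id" and "name" are assumptions for this sketch.
rows = [
    {"mfl_id": "11724", "name": "Christian Jones", "gsis_id": "00-0034641"},
    {"mfl_id": "14683", "name": "Chris Jones", "gsis_id": "00-0034641"},
]


def find_duplicate_ids(rows, key="gsis_id"):
    """Return {id value: [rows sharing it]} for values that occur more than once."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(key)
        if value:  # skip missing or empty IDs; they are not real collisions
            groups[value].append(row)
    return {value: members for value, members in groups.items() if len(members) > 1}


duplicates = find_duplicate_ids(rows)
for gsis_id, members in duplicates.items():
    # Report every MFL ID that claims the same gsis_id
    print(gsis_id, [m["mfl_id"] for m in members])
```

Skipping empty values matters here, since players without a known gsis_id would otherwise all be reported as one large "duplicate" group.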

trojanguard25 commented 3 years ago

I think gsis_id of 00-0034641 goes with MFL ID 14683 (Chris Jones)

This matches what is in nflfastr-roster.

mfl_id 11724 (Christian Jones) should have a gsis_id of 00-0031130 (according to nflfastr-roster).

tanho63 commented 3 years ago

I'm planning a bit of a revamp of this in conjunction with tweaking @trojanguard25's PR #20. I was previously sourcing gsis_id primarily from the Sleeper API, but with the new nflfastr-roster data repo (new since I wrote the code) I can join on the sportradar ID and make sure the gsis_id comes from there instead. That should be more robust.
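Roughly, the join idea looks like the sketch below. This is not the actual pipeline code, which isn't shown in this thread. The function name `fill_gsis_from_roster`, the field names, and the `sr-placeholder-*` sportradar IDs are all hypothetical; only the gsis_id values come from this issue.

```python
def fill_gsis_from_roster(players, roster):
    """Take gsis_id from the roster data, matching rows on sportradar_id.

    Players with no matching roster row keep whatever gsis_id they already had.
    """
    roster_gsis = {
        r["sportradar_id"]: r["gsis_id"]
        for r in roster
        if r.get("sportradar_id") and r.get("gsis_id")
    }
    merged = []
    for player in players:
        fixed = dict(player)  # copy so the input rows are left untouched
        sr_id = player.get("sportradar_id")
        if sr_id in roster_gsis:
            fixed["gsis_id"] = roster_gsis[sr_id]
        merged.append(fixed)
    return merged


# Hypothetical inputs: the sportradar IDs are placeholders, not real IDs.
players = [
    {"mfl_id": "11724", "sportradar_id": "sr-placeholder-a", "gsis_id": "00-0034641"},
    {"mfl_id": "14683", "sportradar_id": "sr-placeholder-b", "gsis_id": "00-0034641"},
]
roster = [
    {"sportradar_id": "sr-placeholder-a", "gsis_id": "00-0031130"},
    {"sportradar_id": "sr-placeholder-b", "gsis_id": "00-0034641"},
]

merged = fill_gsis_from_roster(players, roster)
for row in merged:
    print(row["mfl_id"], row["gsis_id"])
```

Keeping the old value when there is no roster match is a design choice for this sketch; a stricter version could blank the field instead so unverified IDs never pass through.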

damj97 commented 3 years ago

Great, thanks guys. One thing I've noticed about Sleeper is that their database isn't very clean. I found a bunch of bad IDs, which I sent to them. I'm not sure which sites I'll scrape from, so right now I'm just making sure I have as many cross-references as possible.

tanho63 commented 3 years ago

That problem's cropped up before. I solved it like this: https://github.com/dynastyprocess/data/issues/13

damj97 commented 3 years ago

Thanks, I'll check it out. I started using Excel Power Queries in 2019, which is where I first learned about you guys. I didn't plan last year ... so I look forward to learning more about scraping and FF data management from you guys.

tanho63 commented 3 years ago

Hiya! Sorry to leave this languishing for a bit - it should be solved as of the most recent commit. Closing.