mattabullock / Baseball-GDT-Bot

A Reddit bot that will generate, post, and keep baseball game discussion threads updated with live stats and scores.

New GD2 format #41

Closed: john-b-edwards closed this issue 6 years ago

john-b-edwards commented 6 years ago

While setting Metsbot up for the season, I realized that the GD2 URL for a day cannot use a trailing slash. For example, http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_23/ does not work, but http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_23 does. Weirdly enough, the trailing slash is still required for other elements of the database: http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_22/gid_2018_02_22_bocbbc_bosmlb_2/ works, but http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_22/gid_2018_02_22_bocbbc_bosmlb_2 does not. I'm trying to get a fix up for myself, but this seems like an important, if relatively minor, issue.
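The two URL shapes described above can be captured in a pair of small helpers. This is only a minimal Python 3 sketch: the names `GD2_BASE`, `day_url`, and `game_url` are hypothetical and do not come from the bot, and the slash behavior is simply what was observed at the time of this report.

```python
from datetime import date

# Base path taken from the URLs quoted in this issue.
GD2_BASE = "http://gd2.mlb.com/components/game/mlb"

def day_url(day: date) -> str:
    """Day-level listing: reported to work only WITHOUT a trailing slash."""
    return f"{GD2_BASE}/year_{day.year}/month_{day.month:02d}/day_{day.day:02d}"

def game_url(day: date, gid: str) -> str:
    """Game-level directory: reported to work only WITH a trailing slash."""
    return f"{day_url(day)}/{gid}/"

print(day_url(date(2018, 2, 23)))
print(game_url(date(2018, 2, 22), "gid_2018_02_22_bocbbc_bosmlb_2"))
```

Keeping the slash rule inside the helpers means the rest of the code never has to remember which level wants which form.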

avery-crudeman commented 6 years ago

Weird. I just noticed this myself. I wonder what the cause is.


john-b-edwards commented 6 years ago

Some of the saber community noticed a couple weeks ago. Really frustrating because it breaks a lot of scrapers. https://www.reddit.com/r/Sabermetrics/comments/7v07ax/mlb_gameday_data_sites_down_went_private/

avery-crudeman commented 6 years ago

Hurm. I've removed the trailing slash, but this section of main.py:

for v in html:
    if self.TEAM_CODE in v:
        v = v[v.index("\"") + 1:len(v)]
        v = v[0:v.index("\"")]
        directories.append(url + v)

appends "-//W3C//DTD HTML 3.2 Final//EN" instead of the gid for the game.
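One possible explanation, which is an assumption and not something confirmed in this thread: if the new listing arrives as a single line (or the matching line also contains the DOCTYPE), then slicing on the first double quote returns the DOCTYPE string rather than the link. The Python 3 sketch below reproduces that with a made-up listing and shows a regex-based alternative. The `sample` HTML, the second gid in it, and the helper name `extract_game_dirs` are all hypothetical.

```python
import re

def extract_game_dirs(listing_html: str, team_code: str) -> list[str]:
    """Return href values that look like game directories for team_code."""
    hrefs = re.findall(r'href="([^"]+)"', listing_html)
    # Keep only gid_ links for the team; drop the trailing slash so the
    # caller decides how to join the pieces.
    return [h.rstrip("/") for h in hrefs if "gid_" in h and team_code in h]

# Made-up single-line listing, for illustration only.
sample = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html><body>'
    '<li><a href="gid_2018_02_22_bocbbc_bosmlb_2/">gid_2018_02_22_bocbbc_bosmlb_2/</a></li>'
    '<li><a href="gid_2018_02_22_minmlb_tbamlb_1/">gid_2018_02_22_minmlb_tbamlb_1/</a></li>'
    '</body></html>'
)

# What the first-quote slicing from main.py yields on this input.
first_quoted = sample[sample.index('"') + 1:]
first_quoted = first_quoted[:first_quoted.index('"')]

print(first_quoted)
print(extract_game_dirs(sample, "bos"))
```

Matching on `href` values instead of the first quoted string does not depend on how the server breaks the listing into lines.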

toddrob99 commented 6 years ago

Feel free to delete this comment if it's not appropriate (or let me know and I will), but I've got this working on my fork: https://github.com/toddrob99/Baseball-GDT-Bot. My fork uses grid.json to find the games, rather than reading the lines from the directory listing.

toddrob99 commented 6 years ago

If you remove the trailing slash from the url variable, you have to add it back anywhere you concatenate something onto the end. For example, directories.append(url + "/" + v). I don't remember if the url appears in editor.py anywhere in this fork, but if so, it needs to be updated there too.
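A small join helper is one way to avoid tracking the slash by hand at every concatenation. This is a sketch only; `join_url` is a hypothetical name and is not part of either fork.

```python
def join_url(base: str, *parts: str) -> str:
    """Join URL pieces with exactly one slash between them and none at the end."""
    pieces = [base.rstrip("/")] + [p.strip("/") for p in parts]
    return "/".join(p for p in pieces if p)

url = "http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_22"
v = "gid_2018_02_22_bocbbc_bosmlb_2/"

# Same result whether or not `url` or `v` carries a stray slash.
print(join_url(url, v))
print(join_url(url + "/", v))
```

Since the game-level directories were reported to still need the trailing slash, a caller would append it explicitly, for example `join_url(url, v) + "/"`.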

toddrob99 commented 6 years ago

I think you'll also find a new breakage where editor.py tries to download gamecenter.xml before it's posted. It used to fail gracefully, but now the gd2 server throws a new error. This impacted my fork, but I may have been downloading it differently.

Here's the commit where I fixed both of these issues (among other things, and there was one more commit after this fixing a couple other things that probably don't apply to this fork): https://github.com/toddrob99/Baseball-GDT-Bot/commit/da2e5a6422604638cd4b607d647628ab450b641b.
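For reference, failing gracefully when a file such as gamecenter.xml is not posted yet could look roughly like the sketch below. It is written for Python 3 with the standard `urllib` modules and is not taken from the linked commit; `fetch_or_none` and its `opener` parameter are hypothetical names, the latter added only so the function can be exercised without a network connection.

```python
import urllib.error
import urllib.request

def fetch_or_none(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response body as bytes, or None if the file is unavailable."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # Raised for 404 and other HTTP error statuses.
        print(f"{url} not available yet (HTTP {err.code})")
        return None
    except urllib.error.URLError as err:
        # Raised for DNS failures, refused connections, and similar problems.
        print(f"could not reach {url}: {err.reason}")
        return None
```

The caller can then skip the section that depends on the file whenever `None` comes back, instead of letting the exception end the update loop.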

avery-crudeman commented 6 years ago

Thanks! This is helpful.

john-b-edwards commented 6 years ago

Todd, maybe you can help - I keep encountering a 404 URL error, and I can't figure out where it's coming from.

Preparing to post pregame thread for Game 1 ...
Suppressing pregame thread for Game 1 because game thread will be posted soon...
Game 1 thread already posted, getting submission...
HTTP Error 404: Not Found
23 12:35:36 PM Game 1 edits submitted.
Sleeping for 5 seconds...
HTTP Error 404: Not Found

Edit: to clarify, this is using your fork.

Edit 2: Very weird, it seemed to magically resolve itself. Alright.

toddrob99 commented 6 years ago

@Metlover that 404 error was due to one of the files not being available, probably plays.json. I also noticed today the header was not posting because of missing weather info in plays.json. Once the files are posted around game time, the 404 will stop. Then once the weather info was added to plays.json, the header would have posted. I fixed the header issue in my v5.0.3 branch this morning, but wanted to see it work through today's game before merging into Master. Here's the commit: https://github.com/toddrob99/Baseball-GDT-Bot/commit/d36d7b268d74555c7224dce79caccc8802b3b869.
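The "404 until the files are posted" behavior described above amounts to polling. A rough Python 3 sketch of that idea follows; it is not how either fork is implemented. `wait_for` is a hypothetical helper, and `fetch` stands for any callable that returns `None` while the file is still missing.

```python
import time

def wait_for(fetch, attempts=60, delay=5, sleep=time.sleep):
    """Call fetch() until it returns something other than None, or give up."""
    for attempt in range(1, attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts:
            # Pause between tries, but not after the final one.
            sleep(delay)
    return None
```

The 5-second default only mirrors the "Sleeping for 5 seconds" line in the log above; the attempt limit is an arbitrary choice that keeps the loop from running forever if a file is never posted.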

toddrob99 commented 6 years ago

@Metlover feel free to message me on reddit if you need anything else. Username is toddrob.

john-b-edwards commented 6 years ago

Awesome, thanks. I'll close this since I think we have most of it resolved.

toddrob99 commented 6 years ago

@Metlover no way to message you on github so I'm commenting instead. Sorry for spamming anyone else subscribed. Send me a message on reddit so I can explain a couple things about my fork of the bot.