john-b-edwards closed this issue 6 years ago
Weird. I just noticed this myself. I wonder what the cause is.
On Feb 22, 2018 10:48 PM, "Metlover" notifications@github.com wrote:
I realized in setting Metsbot up for the season that the GD2 url cannot use a trailing slash. I.e. http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_23/ does not work, but http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_23 does. Weirdly enough, the trailing slash is still required for other elements of the database: i.e. http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_22/gid_2018_02_22_bocbbc_bosmlb_2/ works, but http://gd2.mlb.com/components/game/mlb/year_2018/month_02/day_22/gid_2018_02_22_bocbbc_bosmlb_2 does not. I'm trying to get a fix up for myself but this seems like an important but relatively minor issue.
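For anyone patching their own scraper, the slash behaviour described in this issue could be captured in two small helpers. This is only a sketch of the rule as reported here (day listing without a trailing slash, gid directory with one); `GD2_ROOT`, `day_url` and `game_url` are made-up names for illustration and are not part of the bot.

```python
from datetime import date

# Root taken from the URLs quoted in this issue.
GD2_ROOT = "http://gd2.mlb.com/components/game/mlb"


def day_url(d):
    """Day listing URL. Per this issue it must NOT end with a slash."""
    return "{}/year_{:04d}/month_{:02d}/day_{:02d}".format(
        GD2_ROOT, d.year, d.month, d.day
    )


def game_url(d, gid):
    """Game directory URL. Per this issue it MUST end with a slash."""
    # Strip any slashes already on the gid so we never end up with "//".
    return "{}/{}/".format(day_url(d), gid.strip("/"))


print(day_url(date(2018, 2, 23)))
print(game_url(date(2018, 2, 22), "gid_2018_02_22_bocbbc_bosmlb_2"))
```

Keeping the slash decision in one place means the rest of the code never has to remember which kind of URL it is holding.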
Some of the saber community noticed a couple weeks ago. Really frustrating because it breaks a lot of scrapers. https://www.reddit.com/r/Sabermetrics/comments/7v07ax/mlb_gameday_data_sites_down_went_private/
Hurm. I've removed the trailing slash, but this section of main.py:
    for v in html:
        if self.TEAM_CODE in v:
            v = v[v.index("\"") + 1:len(v)]
            v = v[0:v.index("\"")]
            directories.append(url + v)
appends "-//W3C//DTD HTML 3.2 Final//EN" instead of the gid for the game.
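The loop above takes whatever sits between the first pair of double quotes on a matching line, so if the DOCTYPE declaration ends up on that line, its public identifier is what gets appended. One possible workaround, sketched below, is to look only at href values that contain a gid. This assumes the day listing links each game as href="gid_.../" (relative or absolute); `GID_HREF` and `find_game_dirs` are hypothetical names, not code from main.py.

```python
import re

# Matches href="gid_xxx/" or href="/some/path/gid_xxx/" and captures only the gid.
# Assumption: the listing links each game directory this way.
GID_HREF = re.compile(r'href="(?:[^"]*/)?(gid_[^"/]+)/?"')


def find_game_dirs(listing_html, team_code, day_url):
    """Return game directory URLs for team_code found in a day listing."""
    base = day_url.rstrip("/")  # the day URL may or may not carry a slash
    return [
        base + "/" + gid + "/"  # gid directories need the trailing slash
        for gid in GID_HREF.findall(listing_html)
        if team_code in gid
    ]
```

Because the pattern only accepts quoted strings that follow href= and contain gid_, the DOCTYPE string can no longer be picked up, whether the page arrives as one line or many.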
Feel free to delete this comment if it's not appropriate (or let me know and I will), but I've got this working on my fork: https://github.com/toddrob99/Baseball-GDT-Bot. My fork uses grid.json to find the games, rather than reading the lines from the directory listing.
If you remove the trailing slash from the url var, you have to include it anywhere you concatenate anything on the end. For example: directories.append(url + "/" + v). I don't remember if the url appears in editor.py anywhere in this fork, but if so, it needs to be updated there too.
I think you'll also find a new breakage where editor.py tries to download gamecenter.xml before it's posted. It used to fail gracefully, but now the gd2 server throws a new error. This impacted my fork, but I may have been downloading it differently.
Here's the commit where I fixed both of these issues (among other things, and there was one more commit after this fixing a couple other things that probably don't apply to this fork): https://github.com/toddrob99/Baseball-GDT-Bot/commit/da2e5a6422604638cd4b607d647628ab450b641b.
Thanks! This is helpful.
Todd, maybe you can help - I keep encountering a 404 URL error, and I can't figure out where it's coming from.
Preparing to post pregame thread for Game 1 ...
Suppressing pregame thread for Game 1 because game thread will be posted soon...
Game 1 thread already posted, getting submission...
HTTP Error 404: Not Found
23 12:35:36 PM Game 1 edits submitted. Sleeping for 5 seconds...
HTTP Error 404: Not Found
Edit: to clarify, this is using your fork.
Edit 2: Very weird, seemed to magically resolve itself. Alright.
@Metlover that 404 error was due to one of the files not being available, probably plays.json. I also noticed today the header was not posting because of missing weather info in plays.json. Once the files are posted around game time, the 404 will stop. Then once the weather info was added to plays.json, the header would have posted. I fixed the header issue in my v5.0.3 branch this morning, but wanted to see it work through today's game before merging into Master. Here's the commit: https://github.com/toddrob99/Baseball-GDT-Bot/commit/d36d7b268d74555c7224dce79caccc8802b3b869.
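For reference, a rough sketch of treating "file not posted yet" as a normal condition rather than an error. This is not the code from either fork; it uses Python 3's urllib, and `fetch_if_posted` and its `opener` parameter are names invented for this example.

```python
import urllib.error
import urllib.request


def fetch_if_posted(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the server answers 404."""
    try:
        with opener(url) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # File has not been posted yet; the caller can try again later.
            return None
        raise  # any other HTTP error is re-raised unchanged
```

A caller could skip the affected section when this returns None and try again on the next edit cycle, which is roughly what the log above shows happening once the files appeared.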
@Metlover feel free to message me on reddit if you need anything else. Username is toddrob.
Awesome, thanks. I'll close this since I think we have most of it resolved.
@Metlover no way to message you on github so I'm commenting instead. Sorry for spamming anyone else subscribed. Send me a message on reddit so I can explain a couple things about my fork of the bot.