joenano / rpscrape

Scrape horse racing results data and racecards.

Issues scraping data for the day #118

Closed: Swebbee closed this issue 2 years ago

Swebbee commented 2 years ago

Not sure if this is just me or if anyone else is having these problems. Some days work, some do not. My last successful scrape was for the 17th July GB data.

If I try the 18th or other dates, I now get the following:

Traceback (most recent call last):
  File "C:\Users**\rpscrape\scripts\rpscrape.py", line 180, in <module>
    main()
  File "C:\Users**\rpscrape\scripts\rpscrape.py", line 176, in main
    scrape_races(races, args['folder_name'], args['file_name'], file_extension, args['type'], file_writer)
  File "C:\Users**\rpscrape\scripts\rpscrape.py", line 112, in scrape_races
    race = Race(url, doc, code, settings.fields)
  File "C:\Users**\rpscrape\scripts\utils\race.py", line 61, in __init__
    pedigree = Pedigree(xpath(self.doc, 'tr', 'block-pedigreeInfoFullResults', fn='/td'))
  File "C:\Users**\rpscrape\scripts\utils\pedigree.py", line 15, in __init__
    self.pedigree_info()
  File "C:\Users**\rpscrape\scripts\utils\pedigree.py", line 59, in pedigree_info
    self.id_sires.append(ped_info[0].attrib['href'].split('/')[3])
IndexError: list index out of range
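For anyone reading the traceback above: the failing expression indexes twice, `ped_info[0]` and then `.split('/')[3]`, so either an empty pedigree row or a shorter-than-expected link would raise this `IndexError`. A minimal sketch of a guarded lookup follows. The helper name `first_id`, its `position` parameter, and the example link shape are hypothetical and not part of rpscrape.

```python
from typing import List, Optional


def first_id(hrefs: List[str], position: int = 3) -> Optional[str]:
    """Return the path segment at `position` of the first href, or None.

    Hypothetical helper: mirrors ped_info[0].attrib['href'].split('/')[3]
    but returns None instead of raising IndexError when data is missing.
    """
    if not hrefs:
        # No pedigree link was found at all (e.g. an incomplete result page).
        return None
    parts = hrefs[0].split('/')
    # A leading '/' produces an empty first element, so index 3 is the
    # third real path segment.
    return parts[position] if len(parts) > position else None
```

With an assumed link such as `/profile/horse/12345/name`, `first_id` would return `'12345'`; with an empty list or a truncated link it would return `None`, leaving the caller to decide whether to record a blank value or skip the runner.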

Swebbee commented 2 years ago

Looking through some of the results on RP, they appear to be corrupted. So no panic; it will probably clear itself soon. I blame the hot weather.

sfr14 commented 2 years ago

Did you manage to fix this issue? I'm having the same problem today.

Edit: oh, I understand. It's the abandoned meetings that caused it. If anyone knows a way of scraping the results from the races that did go ahead, that would be great.
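One possible approach, as a rough sketch only: wrap the per-race parsing in a `try`/`except` so that a race with missing or corrupted data is skipped and logged rather than aborting the whole run. The names `scrape_available` and `parse_race` are hypothetical; in rpscrape the per-race work happens where `Race(url, doc, code, settings.fields)` is constructed inside `scrape_races`, and the sketch assumes that call is what fails.

```python
from typing import Any, Callable, Iterable, List, Tuple


def scrape_available(
    urls: Iterable[str],
    parse_race: Callable[[str], Any],
) -> Tuple[List[Any], List[Tuple[str, str]]]:
    """Parse every race that can be parsed; collect the ones that cannot.

    `parse_race` is a stand-in for whatever builds a race from a URL.
    Only lookup errors typical of missing page elements are caught, so
    unrelated bugs still surface.
    """
    parsed: List[Any] = []
    skipped: List[Tuple[str, str]] = []
    for url in urls:
        try:
            parsed.append(parse_race(url))
        except (IndexError, KeyError) as exc:
            # Abandoned or corrupted result page: note it and move on.
            skipped.append((url, repr(exc)))
    return parsed, skipped
```

The `skipped` list can be printed at the end of the run so the affected races can be retried once the source pages have been corrected.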

Swebbee commented 2 years ago

It is not just the abandoned meetings; other races have corrupted results too. This has happened before, and it takes RP a few days to get around to fixing it.