joenano / rpscrape

Scrape horse racing results data and racecards.

Duplicate Course Names #30

Closed PatrickSchutte closed 3 years ago

PatrickSchutte commented 3 years ago

Hi 4A47,

I have run into a small snag. Some course names are duplicated across countries, so it might be best to include the course id, or some other feature, in the output to make them unique?

For example: ' "saf": [ "988 - Arlington", "994 - Ascot", "515 - Bloemfontein",' and ' "gb": [ "32 - Aintree", "2 - Ascot", "3 - Ayr",'

The csv files produce:

"2020-02-15,ascot,1:15,Thames Materials...."

Regards Patrick

joenano commented 3 years ago

Not sure what the problem is. All courses have a unique id: in the example you posted, 994 for Ascot (saf) and 2 for Ascot (gb).
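To make the ambiguity concrete, here is a minimal sketch that finds course names shared by more than one region. It assumes the course lists are shaped like the excerpt quoted above (a region key mapping to "id - name" strings); the names `courses` and `find_duplicate_names` are hypothetical and not part of rpscrape.

```python
from collections import defaultdict

# Excerpt of the course lists quoted in this issue (truncated there as well).
# The variable name `courses` is an assumption for illustration.
courses = {
    "saf": ["988 - Arlington", "994 - Ascot", "515 - Bloemfontein"],
    "gb": ["32 - Aintree", "2 - Ascot", "3 - Ayr"],
}


def find_duplicate_names(courses):
    """Return course names that occur in more than one (region, id) pair."""
    seen = defaultdict(list)
    for region, entries in courses.items():
        for entry in entries:
            # Each entry looks like "994 - Ascot": id, separator, name.
            course_id, name = entry.split(" - ", 1)
            seen[name.lower()].append((region, int(course_id)))
    return {name: places for name, places in seen.items() if len(places) > 1}


duplicates = find_duplicate_names(courses)
print(duplicates)  # {'ascot': [('saf', 994), ('gb', 2)]}
```

The ids are unique, as noted, but a CSV that records only the lowercased name cannot tell the two Ascots apart.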

PatrickSchutte commented 3 years ago

The problem is that the CSV that is created only has the word "ascot" in it, so it is a bit difficult to figure out which country the race comes from.

joenano commented 3 years ago

I think I know what you mean: you want a course id column.

PatrickSchutte commented 3 years ago

Apologies for being too terse. Yes, I am running the command "python rpscrape.py -d 2020/09/01-2020/11/18", and of course it slots all the races into a single CSV file that gives only the course name, so it becomes a bit difficult to figure out in which country each race was run. Thank you for being patient with me.

joenano commented 3 years ago

I have added a region column before the course name.

So you will now see:

Date, SAF, Ascot, Off
Date, GB, Ascot, Off

No idea why it has taken this long; it should have been in from the start. Thanks for bringing this up.
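With a region column in place, rows for same-named courses can be separated on the (region, course) pair. A minimal sketch follows; the header names `date`, `region`, `course`, and `off` and the sample rows are assumptions made for illustration, so check them against the header of the CSV you actually get.

```python
import csv
import io

# Illustrative rows only, not real scraped output; column names are assumed.
sample = """date,region,course,off
2020-02-15,GB,Ascot,1:15
2020-02-15,SAF,Ascot,1:15
"""


def rows_for(csv_text, region, course):
    """Return the rows matching a region and course, ignoring case."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if row["region"].lower() == region.lower()
        and row["course"].lower() == course.lower()
    ]


gb_ascot = rows_for(sample, "gb", "ascot")
print(len(gb_ascot), gb_ascot[0]["region"])  # 1 GB
```

For a file on disk, the same filter works by passing an open file object to `csv.DictReader` instead of the `io.StringIO` wrapper.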