PatentsView / PatentsView-DB


Wrong assignee/inventor coordinates #92

Open WWakker opened 3 years ago

WWakker commented 3 years ago

I have checked all assignee and inventor coordinates from 1976 to June 2020 to see how reliable this data is, by testing each coordinate pair against its country's bounding box (a rectangle with North, East, South, and West bounds). It turns out the data is not very reliable: there are many errors (223,195 to be precise).

Here are some of the problems that I found:

ASSIGNEE:

INVENTOR:

CA: California instead of Canada
SC: Scotland instead of Seychelles

For example, here's a plot of assignee and inventor coordinates that have country code "CA" but fall outside the bounding box of Canada: [Image: 2_CA_34111_errors]

In case you want to look into this, I'm attaching a zip file containing a CSV of all the errors that I found: coords_outside_bounding_box.zip
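For anyone who wants to reproduce this kind of check, here is a minimal sketch of a bounding-box test. It is not the exact script behind the numbers above. The record field names (`id`, `country`, `latitude`, `longitude`) are hypothetical rather than the actual PatentsView column names, and the box values for Canada and Seychelles are rough illustrative approximations, not authoritative boundaries.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Rectangle in decimal degrees (North/East/South/West bounds)."""
    north: float
    east: float
    south: float
    west: float

    def contains(self, lat: float, lon: float) -> bool:
        # Simple containment test; assumes the box does not cross the antimeridian.
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# ASSUMPTION: approximate, illustrative bounds only.
COUNTRY_BOXES = {
    "CA": BoundingBox(north=83.2, east=-52.6, south=41.6, west=-141.1),  # Canada
    "SC": BoundingBox(north=-3.7, east=56.3, south=-10.3, west=46.1),    # Seychelles
}


def find_outliers(records):
    """Return records whose coordinates fall outside their country's box.

    Records with a country code that has no known box are skipped.
    """
    outliers = []
    for rec in records:
        box = COUNTRY_BOXES.get(rec["country"])
        if box is not None and not box.contains(rec["latitude"], rec["longitude"]):
            outliers.append(rec)
    return outliers


# Hypothetical sample records (field names are assumptions).
records = [
    {"id": "a1", "country": "CA", "latitude": 45.42, "longitude": -75.70},   # Ottawa
    {"id": "a2", "country": "CA", "latitude": 37.77, "longitude": -122.42},  # San Francisco
    {"id": "i1", "country": "SC", "latitude": 55.95, "longitude": -3.19},    # Edinburgh
]

outliers = find_outliers(records)
print([rec["id"] for rec in outliers])
```

A rectangle is a coarse test: it can only prove that a point is outside a country, never that it is inside. Countries whose territory crosses the 180° meridian would need a box split into two parts, which this sketch does not handle.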

emelluso commented 3 years ago

Thank you, @WWakker. We are investigating a replacement for the lat/long lookup data, as it is not consistently accurate for non-U.S. locations. We are looking for alternative sources of location data to improve the accuracy of this visualization!

WWakker commented 3 years ago

Thanks for looking into this!

However, I wouldn't say it is consistently accurate for US locations either. According to my analysis, the US ranks ninth by number of coordinates outside the bounding box. That said, most patents are based in the US, so relative to the total number of US-based patents it does indeed perform a lot better. [Image: 9_US_6964_errors]