google / patents-public-data

Patent analysis using the Google Patents Public Datasets on BigQuery
https://bigquery.cloud.google.com/dataset/patents-public-data:patents
Apache License 2.0

Decreasing number of annotations in google patents research in recent batches #88

Open complexly opened 9 months ago

complexly commented 9 months ago

Could someone help me understand why the number of annotations in Google Patents Research has been dropping in recent batches?

202208 version: 59,089,580,018 rows
202212 version: 60,246,963,593 rows
202304 version: 63,130,241,301 rows
202307 version: 48,064,657,811 rows
current version: 41,000,981,833 rows

The row count grows through 202304 but drops sharply from 202307 onward. Is this due to a model change, a coverage change, or some other reason? Which version of the data should I rely on for analysis? Is the most recent version the most trustworthy? Thanks!
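
To put numbers on the size of the drop, here is a minimal sketch that computes the batch-to-batch change from the row counts quoted above. The counts are copied from this issue, not queried from BigQuery, and the helper name `batch_changes` is only illustrative.

```python
# Row counts as quoted in this issue, keyed by batch label (not queried live).
row_counts = {
    "202208": 59_089_580_018,
    "202212": 60_246_963_593,
    "202304": 63_130_241_301,
    "202307": 48_064_657_811,
    "current": 41_000_981_833,
}


def batch_changes(counts):
    """Return (previous, current, absolute change, percent change) per consecutive pair."""
    labels = list(counts)  # dicts preserve insertion order, so this is chronological
    changes = []
    for prev, cur in zip(labels, labels[1:]):
        delta = counts[cur] - counts[prev]
        pct = 100.0 * delta / counts[prev]
        changes.append((prev, cur, delta, pct))
    return changes


for prev, cur, delta, pct in batch_changes(row_counts):
    print(f"{prev} -> {cur}: {delta:+,} rows ({pct:+.1f}%)")
```

On these figures, the table loses roughly 24% of its rows between 202304 and 202307, and roughly a further 15% between 202307 and the current version, after growing by a few percent per batch before that.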