google / patents-public-data

Patent analysis using the Google Patents Public Datasets on BigQuery
https://bigquery.cloud.google.com/dataset/patents-public-data:patents
Apache License 2.0

confidence>1 for annotations in google patent research #89

Open complexly opened 9 months ago

complexly commented 9 months ago

In the current google patent research annotations, confidence is sometimes greater than 1, and the corresponding conf_bucket is then greater than 1000. Is there a numerical or formatting error behind this? If not, how should I interpret a confidence score above 1? Thanks!

Some samples from BigQuery with confidence > 1:

| sampleid | publication_number | confidence | conf_bucket |
|---|---|---|---|
| 1 | US-6818775-B2 | 1.05 | 1050 |
| 2 | AU-2017219004-B2 | 1.0999999 | 1099 |
| 3 | US-2016046635-A1 | 1.1999998 | 1199 |
| 4 | US-9643971-B2 | 1.1999998 | 1199 |
| 5 | ES-2438576-T3 | 1.05 | 1050 |
| 6 | JP-WO2007013641-A1 | 1.05 | 1050 |
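
For what it's worth, here is a small sanity check on the sample rows above. It tests one guess about how the two columns relate: that conf_bucket is confidence multiplied by 1000 and truncated to an integer. That relationship is only an assumption read off these six rows, not something the dataset documentation confirms, and `bucket_from_confidence` is a hypothetical helper name used for illustration.

```python
from decimal import Decimal

# Sample rows copied from the table above:
# (publication_number, confidence as text, conf_bucket)
SAMPLES = [
    ("US-6818775-B2", "1.05", 1050),
    ("AU-2017219004-B2", "1.0999999", 1099),
    ("US-2016046635-A1", "1.1999998", 1199),
    ("US-9643971-B2", "1.1999998", 1199),
    ("ES-2438576-T3", "1.05", 1050),
    ("JP-WO2007013641-A1", "1.05", 1050),
]


def bucket_from_confidence(confidence: str) -> int:
    """Hypothetical helper: ASSUMES conf_bucket = trunc(confidence * 1000).

    Decimal is used so the displayed value is multiplied exactly,
    without binary floating-point rounding.
    """
    return int(Decimal(confidence) * 1000)


# Rows where the assumed relationship does not reproduce conf_bucket.
mismatches = [
    (pub, conf, bucket)
    for pub, conf, bucket in SAMPLES
    if bucket_from_confidence(conf) != bucket
]

# Rows whose confidence exceeds 1, i.e. the ones this issue is about.
out_of_range = [pub for pub, conf, _ in SAMPLES if Decimal(conf) > 1]

print("rows not matching the assumed bucketing:", mismatches)
print("rows with confidence > 1:", len(out_of_range))
```

If the assumption holds for these rows, the two columns at least agree with each other, which would suggest conf_bucket is derived from confidence and the values above 1 originate in confidence itself rather than in a formatting step.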