PatentsView / PatentsView-API

Strange behavior when locations endpoint is asked for fields across several groups #43

Open crew102 opened 4 years ago

crew102 commented 4 years ago

I noticed that the locations endpoint behaves strangely when you ask it for fields from multiple "groups" (i.e., the groups in the field list at https://www.patentsview.org/api/location.html#field_list). The reprex below requests four fields, each from a different group. I can get them one at a time (iterating over the four fields), but when I ask for all four at once, every location comes back empty.

# Note, you need to install the development version of patentsview for this
# example to work. Install instructions are at https://github.com/ropensci/patentsview#installation.
library(patentsview)

# Four fields across four different groups
fields <- c(
  "app_country", "appcit_app_number", "assignee_first_name", "cpc_category"
)

# I get data back when I just request one field at a time
lapply(
  fields, function(x) {
    out <- search_pv(
      query = "{\"patent_number\":\"5116621\"}",
      endpoint = "locations", fields = x
    )
    out$data$locations
  }
)
#> [[1]]
#>    applications
#> 1 US, 07/633146
#> 2 US, 07/633146
#> 3 US, 07/633146
#> 4 US, 07/633146
#> 
#> [[2]]
#>   application_citations
#> 1                    NA
#> 2                    NA
#> 3                    NA
#> 4                    NA
#> 
#> [[3]]
#>                                                                                                                                                                                                                                                                                                                                                                                                             assignees
#> 1                                                                                                                                                                                                                                                                                                                                                                                              NA, NA, 355732, 266721
#> 2                                                                                                                                                                                                                                                                                                                                                                                                          NA, 418770
#> 3                                                                                                             NA, NA, NA, NA, NA, NA, NA, Mitsuharu, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, Michihiro, NA, NA, NA, NA, NA, 382287, 400473, 424537, 424521, 223745, 31540, 228806, 449617, 89956, 154906, 312721, 224607, 362422, 26496, 319914, 388631, 151555, 91808, 435483, 336374, 393982, 395807, 14773, 268676
#> 4 Nobuo, NA, NA, NA, Katsuya, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 438500, 208580, 107932, 254267, 455621, 75677, 230390, 28244, 259040, 134943, 359304, 425372, 251726, 72364, 50779, 320071, 300279, 66679, 329152, 164075, 126024, 33651, 177375, 386925, 405566, 25355, 94566, 158880, 45188, 88534, 200166, 47321, 239233, 103790
#> 
#> [[4]]
#>          cpcs
#> 1 inventional
#> 2 inventional
#> 3 inventional
#> 4 inventional

# ...But not when I ask for them all at once
search_pv(
  query = "{\"patent_number\":\"5116621\"}",
  endpoint = "locations", fields = fields
)
#> $data
#> #### A list with a single data frame on a location level:
#> 
#> List of 1
#>  $ locations:List of 4
#>   ..$ : list()
#>   ..$ : list()
#>   ..$ : list()
#>   ..$ : list()
#> 
#> $query_results
#> #### Distinct entity counts across all downloadable pages of output:
#> 
#> total_location_count = 4

Created on 2019-11-25 by the reprex package (v0.3.0)

DiPietroch commented 4 years ago

Hi crew102,

Thank you for bringing this issue to us. The problem you are having seems to stem from the patentsview package in R rather than from the API itself.

To run this query against the API directly, you can use the following URL:

https://www.patentsview.org/api/locations/query?q={"patent_number":"5116621"}&f=["app_country","appcit_app_number","assignee_first_name","cpc_category"]

This will return the results you were looking for without needing to request each field individually.

Please let us know if you have any future issues with the API.

crew102 commented 4 years ago

So it looks like it has something to do with the matched_subentities_only parameter.

When I set matched_subentities_only to false, I get results:

https://www.patentsview.org/api/locations/query?q={"patent_number":"5116621"}&f=["app_country","appcit_app_number","assignee_first_name","cpc_category"]&o={"include_subentity_total_counts":false,"matched_subentities_only":false,"page":1,"per_page":25}&s=

However, when I set matched_subentities_only to true, I don't:

https://www.patentsview.org/api/locations/query?q={"patent_number":"5116621"}&f=["app_country","appcit_app_number","assignee_first_name","cpc_category"]&o={"include_subentity_total_counts":false,"matched_subentities_only":true,"page":1,"per_page":25}&s=

I'm not 100% clear on what this parameter does, but I know that the query shown above used to return results and no longer does. Am I missing something, or has something changed with the API? Thanks.
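
For anyone who wants to rebuild the two requests above from R without going through the patentsview package, here is a minimal sketch. It only constructs the URLs and sends nothing to the API. `build_locations_url()` is a hypothetical helper written for this comparison, not a function from the package, and percent-encoding the JSON values is an assumption about what an HTTP client would need rather than something the API documentation is quoted as requiring.

```r
# Hypothetical helper (not part of the patentsview package): builds the raw
# locations query URL with matched_subentities_only set to TRUE or FALSE.
build_locations_url <- function(matched_only) {
  base <- "https://www.patentsview.org/api/locations/query"
  q <- '{"patent_number":"5116621"}'
  f <- '["app_country","appcit_app_number","assignee_first_name","cpc_category"]'
  o <- sprintf(
    '{"include_subentity_total_counts":false,"matched_subentities_only":%s,"page":1,"per_page":25}',
    tolower(as.character(matched_only))
  )
  # Percent-encode each JSON value (assumption: the client does not do this for us)
  enc <- function(x) utils::URLencode(x, reserved = TRUE)
  paste0(base, "?q=", enc(q), "&f=", enc(f), "&o=", enc(o))
}

url_false <- build_locations_url(FALSE)
url_true <- build_locations_url(TRUE)
```

The two URLs differ only in the value of `matched_subentities_only`, so fetching both and comparing the responses isolates the effect of that one option.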

DiPietroch commented 4 years ago

Thank you for bringing this to our attention. It looks as though there is an issue with the matched_subentities_only parameter at the moment. We are working to fix this and will update you and close this issue once it has been resolved.