pulibrary / pdc_discovery

Princeton Data Commons discovery portal for Research Data

Bug: Query by ORCID returns a longer list that doesn't include the search author #550

Closed (astrochun closed this 5 months ago)

astrochun commented 7 months ago

I noticed that a query by ORCID returns a list of datasets that is not limited to that author.

Here's an example query: https://datacommons.princeton.edu/discovery/?&q=0000-0001-9636-8181&search_field=

Several of the results list Jacob Schwartz as an author. However, the later datasets in the list do not include him.

Acceptance Criteria:

carolyncole commented 5 months ago

Apologies for the late response. This bug got lost in the shuffle. @astrochun You can add quotes around the ORCID and get a smaller list. We also have a separate search type that matches the ORCID against authors only: https://datacommons.princeton.edu/discovery/?search_field=orcid&q=0000-0001-9636-8181 I do not believe there is much to fix here.

astrochun commented 5 months ago

I see that quotes do help, but note that this query came from the "Find other works by this author" feature within Discovery, which runs when you click on an individual author. So if we can modify the query that feature generates, the results would be what we expect.

carolyncole commented 5 months ago

@astrochun In an "All fields" search, the ORCID is used as a text match against any field in the work. If you select ORCID in the drop-down or put quotes around the ORCID, the entire ORCID is matched. Semantically, a word that contains dashes can be treated as several tokens. So I assume the other matches occur because someone in those other works has a partial match to the user's ORCID, i.e., the '0000' or the '0001' matches part of another author's ORCID. We have a special search for ORCID, and I would suggest that is the best way to search the system for ORCIDs.
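The partial-match behaviour described above can be illustrated with a small sketch. It is not the Discovery search code: the `tokenize` rule (split on any non-alphanumeric character), the "any shared token" matching, and the second ORCID are all assumptions made up for illustration, and the real analyzer may behave differently.

```python
import re

def tokenize(text):
    """Split on anything that is not a letter or digit (assumed analyzer behaviour)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def any_token_match(query, field_value):
    """Unquoted search (assumed): match if at least one token is shared."""
    return bool(set(tokenize(query)) & set(tokenize(field_value)))

def phrase_match(query, field_value):
    """Quoted search (assumed): all query tokens must appear consecutively, in order."""
    q, f = tokenize(query), tokenize(field_value)
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))

searched = "0000-0001-9636-8181"      # ORCID from this issue
other_author = "0000-0002-1234-5678"  # made-up ORCID, for illustration only

print(tokenize(searched))
print(any_token_match(searched, other_author))  # shares the "0000" token
print(phrase_match(searched, other_author))     # full ORCID does not match
```

Under these assumptions, the unquoted query matches the other author because both ORCIDs contain the token `0000`, while the quoted (phrase) query matches only the full identifier.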

carolyncole commented 5 months ago

https://datacommons.princeton.edu/discovery/?search_field=orcid&q=%220000-0001-9636-8181%22
https://datacommons.princeton.edu/discovery/?search_field=all_fields&q=%220000-0001-9636-8181%22
https://datacommons.princeton.edu/discovery/?search_field=orcid&q=0000-0001-9636-8181

All three look like the correct 6 results to me.
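For reference, the links above differ only in the `search_field` parameter and in whether the ORCID is wrapped in quotes (`%22` is a URL-encoded double quote). A minimal sketch of building such links follows; the helper name `search_url` is hypothetical and is not part of pdc_discovery.

```python
from urllib.parse import urlencode

BASE = "https://datacommons.princeton.edu/discovery/"

def search_url(orcid, search_field="orcid", quoted=False):
    """Build a Discovery search link (hypothetical helper, for illustration)."""
    q = f'"{orcid}"' if quoted else orcid
    # urlencode percent-encodes the double quotes as %22 and leaves dashes as-is
    return BASE + "?" + urlencode({"search_field": search_field, "q": q})

orcid = "0000-0001-9636-8181"
print(search_url(orcid))                                           # ORCID search
print(search_url(orcid, quoted=True))                              # ORCID search, quoted
print(search_url(orcid, search_field="all_fields", quoted=True))   # all fields, quoted
```

A "by this author" link generated this way with `search_field="orcid"` would presumably avoid the extra partial matches that an unquoted "All fields" search returns.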

astrochun commented 5 months ago

@carolyncole, perhaps this screenshot will help to explain how the search went wrong. This is the "by this author" link that I clicked on.

[Screenshot: Screen Shot 2024-03-11 at 6 28 35 AM]

carolyncole commented 5 months ago

@astrochun Ah! Yes, that makes it clear. We have the wrong link there... Thank you!