internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Author info missing from API #8144

Closed angelxmoreno closed 11 months ago

angelxmoreno commented 11 months ago

https://openlibrary.org/books/OL32336498M/Atomic_Habits has two authors but https://openlibrary.org/books/OL32336498M.json does not show authors.

Evidence / Screenshot (if possible)

[Two screenshots: "Screen Shot 2023-07-30 at 7 15 50 PM" and "Screen Shot 2023-07-30 at 7 16 40 PM"]

Relevant url?

https://openlibrary.org/books/OL32336498M/Atomic_Habits https://openlibrary.org/books/OL32336498M.json

Steps to Reproduce

  1. My API client fetched https://openlibrary.org/isbn/0735211299.json
  2. The request was redirected to https://openlibrary.org/books/OL32336498M.json
  3. No author property was found
  4. I manually went to https://openlibrary.org/books/OL32336498M
  5. I noticed the author data is in fact there

What actually happened after these steps? What did you expect to happen?


Proposal & Constraints

I do not know.

Related files

None that I am aware of.

Stakeholders

not sure

tfmorris commented 11 months ago

The HTML page isn't aligned with the JSON endpoint data. It's a mishmash of data from different places.

Authors are associated with works, so you need to follow the link from the edition to the work and look there. In this case: https://openlibrary.org/works/OL17930368W.json

If you want author names too (likely), you'll then need to follow the individual author links to get the names. You'll probably want to cache this information to keep from having to chase the links all the time.
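The link-chasing described above can be sketched roughly as follows. This is a minimal illustration, not an official client: the helper names `fetch_json` and `author_names_for_edition` are made up for this example, and the record shapes (a `works` list on the edition, an `authors` list of `{"author": {"key": ...}}` entries on the work, a `name` on the author) are assumptions about the JSON that should be checked against real responses. `functools.lru_cache` stands in for the caching suggested above.

```python
import json
from functools import lru_cache
from urllib.request import urlopen

BASE_URL = "https://openlibrary.org"


@lru_cache(maxsize=1024)
def fetch_json(key):
    """Fetch a record such as '/works/OL17930368W' and cache the result."""
    with urlopen(f"{BASE_URL}{key}.json") as response:
        return json.load(response)


def author_names_for_edition(edition_key, fetch=fetch_json):
    """Follow edition -> work -> authors and return the author names."""
    edition = fetch(edition_key)
    names = []
    for work_ref in edition.get("works", []):
        work = fetch(work_ref["key"])
        for entry in work.get("authors", []):
            # Assumed shape: {"author": {"key": "/authors/..."}};
            # a bare string key is tolerated as well.
            author = entry.get("author")
            author_key = author.get("key") if isinstance(author, dict) else author
            if not author_key:
                continue
            name = fetch(author_key).get("name")
            if name and name not in names:
                names.append(name)
    return names
```

Calling `author_names_for_edition("/books/OL32336498M")` would issue one request for the edition, one per work, and one per author; repeated lookups of the same key are served from the in-memory cache. The `fetch` parameter exists only so the traversal can be exercised without network access.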

cdrini commented 11 months ago

Yep, this is the nature of this API, I'm afraid. If you want a more complete record, I'd recommend the search.json endpoint. It'll include some edition data, author name, and some work data.

E.g.: https://openlibrary.org/search.json?q=edition_key%3AOL32336498M&mode=everything&fields=*,editions

Note the fields=*,editions parameter. It fetches every field in our search engine, plus the special editions field containing the matching editions. I'd recommend replacing the * with just the fields you care about to make the output smaller, more performant, and easier to work with, e.g. https://openlibrary.org/search.json?q=edition_key%3AOL32336498M&mode=everything&fields=key,title,subtitle,author_key,author_name,editions
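For API clients, the trimmed-down query above could be assembled programmatically along these lines. This is only a sketch: `edition_search_url` is a hypothetical helper name, while the `q`, `mode`, and `fields` parameters are taken from the URLs in this thread.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://openlibrary.org/search.json"


def edition_search_url(edition_id, fields):
    """Build a search.json URL that looks up one edition by its key."""
    params = {
        "q": f"edition_key:{edition_id}",
        "mode": "everything",
        "fields": ",".join(fields),
    }
    # safe="," keeps the commas in the field list readable instead of %2C.
    return f"{SEARCH_URL}?{urlencode(params, safe=',')}"


url = edition_search_url(
    "OL32336498M",
    ["key", "title", "subtitle", "author_key", "author_name", "editions"],
)
```

With these arguments, `url` matches the second example URL above.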

mateusz-bak commented 5 months ago

@cdrini Can you help me with a related issue I have?

For example, I query: https://openlibrary.org/search.json?q=Pour%20la%20sociologie%20Lahire&limit=10&offset=0&mode=everything&fields=key,title,subtitle,author_key,author_name,editions,number_of_pages_median,first_publish_year,isbn,edition_key,cover_edition_key,cover_i

And the subtitle is not included in the work result, but it is present in the edition result.

On the webpage https://openlibrary.org/works/OL17335916W/Pour_la_sociologie I see that the subtitle is included for the work.

Is the subtitle taken just from the first edition? Is there a way to get the subtitle for a work?

cdrini commented 5 months ago

Hi @mateusz-bak , that all looks correct. The UI of the work page chooses an edition to promote as the primary and renders the cover/title from that edition, so the subtitle is on the edition in this case.

You can see the "raw" work data by going to the JSON page for the work: https://openlibrary.org/works/OL17335916W.json . There is no subtitle!
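A client that wants to mimic the work page's behaviour could fall back to an edition's subtitle when the work has none. The sketch below assumes a search.json result document whose matched editions sit under `editions` -> `docs`; that shape, the helper name `display_subtitle`, and the sample values are illustrative assumptions rather than anything confirmed in this thread.

```python
def display_subtitle(work_doc):
    """Return the work's subtitle, else the first edition subtitle, else None."""
    subtitle = work_doc.get("subtitle")
    if subtitle:
        return subtitle
    # Assumed shape: {"editions": {"docs": [{"subtitle": "..."}, ...]}}
    for edition in work_doc.get("editions", {}).get("docs", []):
        if edition.get("subtitle"):
            return edition["subtitle"]
    return None


# Placeholder data, not a real API response.
sample_doc = {
    "title": "Example title",
    "editions": {"docs": [{"title": "Example title", "subtitle": "Example subtitle"}]},
}
chosen = display_subtitle(sample_doc)
```

This only approximates "promote an edition"; how the site itself picks the primary edition is not described here.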

mateusz-bak commented 5 months ago

Thank you for the clarification @cdrini

tfmorris commented 5 months ago

Works don't have a subtitle field - ever. If you look at the edit form for a work, it says

Title: Use the format "Title: Subtitle" to add a subtitle.

Sometimes people add the subtitle after a colon, other times they don't, but whatever information is available is in the title field.

cdrini commented 5 months ago

Oh actually the stuff after the : is automatically split off into a work subtitle! So although the edit UI for works has only a single field for the title and the subtitle, when the document is saved it splits it out into the two separate fields 👍
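As a rough illustration of that kind of split (not Open Library's actual implementation, whose details are not shown in this thread), a save step might separate the two parts like this. The function name `split_title` and the choice to split on the first colon are assumptions.

```python
def split_title(full_title):
    """Split 'Title: Subtitle' into (title, subtitle); subtitle may be None."""
    title, _, subtitle = full_title.partition(":")
    title = title.strip()
    subtitle = subtitle.strip()
    return title, (subtitle or None)
```

Under this sketch, a value with no colon is returned as the title with a subtitle of `None`, which matches the case where people don't add the subtitle after a colon.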

tfmorris commented 5 months ago

It's odd that the edit forms are different and don't reflect the underlying data structure. Why is that?


cdrini commented 5 months ago

Not sure! It's been that way since before I got here.