internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0
5.26k stars 1.4k forks source link

Replace `ia_collection_s` with multi-value string field `ia_collection` #6836

Open cdrini opened 2 years ago

hornc commented 2 years ago

@cdrini what are IA collection strings used for in OL? I haven't looked at this area for some time, but I remember trying to remove this data from OL as it belongs and is best managed by archive.org. The OL stored version could easily get out of sync.

What I'm getting at is if it's going to be renamed, it should be called something that reflects its meaning or function. I see from #6377 that Solr was indexing archive.org user favourite categories??

If the name is correct and OL is storing and indexing all archive.org categories, that has to be a mistake -- OL can't possibly use or know what to do with all archive.org categories.

If OL uses a subset of categories for a purpose, we should store just those (or a sensible format that is meaningful), and name it clearly.

cdrini commented 2 years ago

Yeah, the fav- one was annoying :P We removed those. We've gotten requests in the past from partners to view their IA collections in open library. E.g. to view the library explorer with just marygrove books, for example. That field is exposed for those purposes.