Closed stopregionblocking closed 1 month ago
Hi, I'm interested in working on this issue. Can someone give me some pointers on how to get started?
Hi, @krthkmgndm, it sounds as if this issue may be a bit too daunting, as the current steps outlined are the best I can do, unfortunately. It may make more sense to try a Good First Issue: https://github.com/internetarchive/openlibrary/issues?q=is%3Aissue+is%3Aopen+label%3A%22Good+First+Issue%22+-linked%3Apr.
It's also worth checking out https://github.com/internetarchive/openlibrary/tree/master/docker#welcome-to-the-docker-installation-guide-for-open-library-developers.
Hi @scottbarnes. I would like to work on this. I have an idea on how to tackle this issue :) ty!
Problem
The feature which allows Amazon.com records to be imported into OpenLibrary via searching for ISBNs will, unfortunately, import DVDs with ISBNs from Amazon.com, even if they are clearly indicated as not being books.
This was confirmed using the example of https://www.amazon.com/gp/product/1621064298
Reproducing the bug
Context
Breakdown
The solution will likely involve some minor modifications to https://github.com/internetarchive/openlibrary/blob/master/openlibrary/core/vendors.py so that DVDs don't return values to get_products(). For the above DVD/item, get_products() returns:
[{'url': 'https://www.amazon.com/dp/1621064298/?tag=internetarchi-20', 'source_records': ['amazon:1621064298'], 'isbn_10': ['1621064298'], 'isbn_13': ['9781621064299'], 'price': '$15.19', 'price_amt': 1519, 'title': 'Homeland Insecurity: Films by Bill Brown', 'cover': 'https://m.media-amazon.com/images/I/41FuCUj3kUL._SL500_.jpg', 'authors': [{'name': 'Brown, Bill'}], 'publishers': ['Microcosm Publishing'], 'number_of_pages': None, 'edition_num': None, 'publish_date': 'Aug 01, 2007', 'product_group': 'DVD', 'physical_format': 'dvd'}]
Of interest are product_group and physical_format. To complete this issue one would likely want to look at https://webservices.amazon.com/paapi5/documentation/ and determine whether we should use product_group, physical_format, both, either, or something else to determine something is a DVD. Or maybe it's better to focus on what is allowed (e.g. books).
In any event, we'll likely want to modify serialize() or get_product() to filter out DVDs (or only allow whatever constitutes books, if the cases are clear).
Requirements Checklist
Determine whether to use product_group, physical_format, both, either, or something else to determine something is a DVD.
Update the code (e.g. by checking physical_format) such that get_product() or serialize() will no longer return metadata for DVDs (e.g. if modifying serialize(), it should return {} -- serialize() may be the better option).
Related files
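For anyone picking this up, here is a minimal sketch of the kind of filter the checklist describes. It is not taken from vendors.py: the helper names is_dvd() and drop_dvds() are hypothetical, the second sample record is made up for illustration, and checking both product_group and physical_format is only an assumption, since which field to rely on is exactly the open question above. The dicts follow the shape of the get_products() output shown in the Breakdown.

```python
from __future__ import annotations

from typing import Any

# Hypothetical marker and helper names; nothing here is copied from vendors.py.
DVD_MARKER = "dvd"


def is_dvd(product: dict[str, Any]) -> bool:
    """Return True if either classification field says the item is a DVD.

    Assumption: a case-insensitive match on product_group or physical_format
    is enough. The real rule should come from the PA-API documentation.
    """
    product_group = (product.get("product_group") or "").lower()
    physical_format = (product.get("physical_format") or "").lower()
    return DVD_MARKER in (product_group, physical_format)


def drop_dvds(products: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the products that are not flagged as DVDs."""
    return [p for p in products if not is_dvd(p)]


sample = [
    {
        # Trimmed copy of the DVD record from this issue.
        "title": "Homeland Insecurity: Films by Bill Brown",
        "isbn_10": ["1621064298"],
        "product_group": "DVD",
        "physical_format": "dvd",
    },
    {
        # Made-up record, for illustration only.
        "title": "A hypothetical book record",
        "isbn_10": ["0000000000"],
        "product_group": "Book",
        "physical_format": "paperback",
    },
]

remaining = drop_dvds(sample)
print([p["title"] for p in remaining])
```

If the check ends up inside serialize() instead, the same predicate could decide whether to return {} for the item. An allow-list of book formats would be the inverse of this test.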
Stakeholders
Instructions for Contributors