AndrewBennet / ReadingListV1

Reading List - an iOS app to track personal reading lists
https://readinglist.app
GNU General Public License v3.0

Book Description contains HTML entities < > #39

Closed: johnjkim closed this issue 4 years ago

johnjkim commented 4 years ago

pic

AndrewBennet commented 4 years ago

Thanks for raising this. It looks like this book has HTML data within the Google book description field, which is unusual. See: https://books.google.co.uk/books?id=GVxwDwAAQBAJ

When I request this book's data from the Google Books API, I see the escaped HTML tags that you see: https://www.googleapis.com/books/v1/volumes/GVxwDwAAQBAJ

I'm not sure what the best way of handling this is, or whether there is anything better than doing nothing! I certainly can't assume that the description text contains escaped HTML tags, as most don't. Stripping them out would also be tricky, as I can't easily tell programmatically what the description is meant to look like. What if the escaped tags are meant to be there? For example, if a description was as follows:

Escaping HTML entities is sometimes required. For example, the HTML tag <i> is escaped to `&lt;i&gt;`

in this case it should be left alone!
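To make the trade-off concrete, here is a minimal sketch of one possible heuristic: strip an escaped tag only when its matching escaped closing tag also appears in the text. This is not what the app does. The function name `strippingPairedEscapedTags` and the `formattingTags` allow-list are hypothetical and exist only for this illustration.

```swift
import Foundation

// Allow-list of formatting tags to look for. This list is an assumption
// made for the sketch, not something taken from the Google Books API.
let formattingTags = ["p", "b", "i", "em", "strong"]

/// Hypothetical helper: removes escaped tags such as "&lt;b&gt;" only when
/// the matching escaped closing tag "&lt;/b&gt;" is also present in the text.
func strippingPairedEscapedTags(from text: String) -> String {
    var result = text
    for tag in formattingTags {
        let openTag = "&lt;\(tag)&gt;"
        let closeTag = "&lt;/\(tag)&gt;"
        // A lone escaped tag is treated as intentional and left untouched.
        guard result.contains(openTag), result.contains(closeTag) else { continue }
        result = result.replacingOccurrences(of: openTag, with: "")
        result = result.replacingOccurrences(of: closeTag, with: "")
    }
    return result
}

// Paired escaped tags are removed.
let marketing = "&lt;p&gt;A &lt;b&gt;great&lt;/b&gt; read.&lt;/p&gt;"
print(strippingPairedEscapedTags(from: marketing))

// The example description above has no closing tag, so it is left alone.
let intentional = "the HTML tag <i> is escaped to &lt;i&gt;"
print(strippingPairedEscapedTags(from: intentional))
```

Even this conservative rule can guess wrong: a description that legitimately mentions both `&lt;i&gt;` and `&lt;/i&gt;` would still be altered, which is the underlying problem with any automatic clean-up.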

I think the ideal solution in this case is for Google Books to fix their metadata, but I'm not aware of a mechanism to report this to them...