edrlab / thorium-reader

A cross platform desktop reading app, based on the Readium Desktop toolkit
https://www.edrlab.org/software/thorium-reader/
BSD 3-Clause "New" or "Revised" License

OPDS text/html description shows raw markup instead of rendered HTML #1376

Closed danielweck closed 3 years ago

danielweck commented 3 years ago

unnamed

llemeurfr commented 3 years ago

The chosen solution is to sanitize the HTML, so that the result can safely be written into a DOM element.
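As a rough illustration of what sanitizing could mean here, the sketch below keeps a small allowlist of attribute-free tags and escapes every other angle bracket. The names `ALLOWED_TAGS` and `sanitizeDescription` are hypothetical, and this is not the code used in Thorium; a real application would more likely rely on a maintained sanitizer library.

```typescript
// Hypothetical allowlist: only these tags survive, and only without attributes.
const ALLOWED_TAGS: string[] = ["p", "b", "i", "em", "strong", "br"];

// Minimal sketch (an assumption, not Thorium's implementation):
// keep attribute-free allowlisted tags, escape every other "<" or ">".
function sanitizeDescription(html: string): string {
  return html.replace(
    /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*\/?>|[<>]/g,
    (match: string, slash?: string, tag?: string): string => {
      if (tag !== undefined && ALLOWED_TAGS.indexOf(tag.toLowerCase()) !== -1) {
        // Re-emit a normalized tag, e.g. "<br />" becomes "<br>".
        return "<" + (slash || "") + tag.toLowerCase() + ">";
      }
      // Anything else (unknown tag, tag with attributes, stray bracket) is escaped.
      return match.replace(/</g, "&lt;").replace(/>/g, "&gt;");
    },
  );
}
```

With this approach a tag carrying attributes is never passed through; its brackets are escaped, so it shows up as visible text rather than executing.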

llemeurfr commented 3 years ago

The change ends up producing a duplicate description. To reproduce: add the English Feedbooks feed in Thorium, choose a book in Featured Picks (I chose The Free World), and open the Info Panel. The description appears twice: the first copy is raw text, the second is HTML.

The source XML is at https://catalog.feedbooks.com/item/3952857.atom, where <summary> is raw text and <content type="html"> is HTML.

In the code I don't see the summary used, but I may have missed it.

The heuristic must be:

Check https://tools.ietf.org/html/rfc4287#page-16 in case you need details.

panaC commented 3 years ago

Output of the OPDS 1 to OPDS 2 converter: https://streamer-aghvbovdia-ew.a.run.app/opds-v1-v2-convert/https%3A%2F%2Fcatalog.feedbooks.com%2Fitem%2F3952857.atom

""An engrossing and impossibly wide-ranging project . . . In The Free World, every seat is a good one." ?Carlos Lozada, The Washington Post"The Free World sparkles. Fully original, beautifully written . . . One hopes Menand has a sequel in mind. The bar is set very high." ?David Oshinsky, The New York Times Book ReviewNamed a most anticipated book of April by The New York Times | The Washington Post | Oprah DailyIn his follow-up to the Pulitzer Prize?winning The Metaphysical Club, Louis Menand offers a new intellectual and cultural history of the postwar yearsThe Cold War was not just a contest of power. It was also about ideas, in the broadest sense?economic and political, artistic and personal. In The Free World, the acclaimed Pulitzer Prize?winning scholar and critic Louis Menand tells the story of American culture in the pivotal years from the end of World War II to Vietnam and shows how changing economic, technological, and social forces put their mark on creations of the mind. How did elitism and an anti-totalitarian skepticism of passion and ideology give way to a new sensibility defined by freewheeling experimentation and loving the Beatles? How was the ideal of ?freedom? applied to causes that ranged from anti-communism and civil rights to radical acts of self-creation via art and even crime? With the wit and insight familiar to readers of The Metaphysical Club and his New Yorker essays, Menand takes us inside Hannah Arendt?s Manhattan, the Paris of Jean-Paul Sartre and Simone de Beauvoir, Merce Cunningham and John Cage?s residencies at North Carolina?s Black Mountain College, and the Memphis studio where Sam Phillips and Elvis Presley created a new music for the American teenager. 
He examines the post war vogue for French existentialism, structuralism and post-structuralism, the rise of abstract expressionism and pop art, Allen Ginsberg?s friendship with Lionel Trilling, James Baldwin?s transformation into a Civil Right spokesman, Susan Sontag?s challenges to the New York Intellectuals, the defeat of obscenity laws, and the rise of the New Hollywood. Stressing the rich flow of ideas across the Atlantic, he also shows how Europeans played a vital role in promoting and influencing American art and entertainment. By the end of the Vietnam era, the American government had lost the moral prestige it enjoyed at the end of the Second World War, but America?s once-despised culture had become respected and adored. With unprecedented verve and range, this book explains how that happened.

        <p><p><b>"An engrossing and impossibly wide-ranging project . . . In <i>The Free World</i>, every seat is a good one." ?Carlos Lozada, <i>The Washington Post</i></b><br><br><b>"<i>The Free World</i> sparkles. Fully original, beautifully written . . . One hopes Menand has a sequel in mind. The bar is set very high." ?David Oshinsky, <i>The New York Times Book Review</i><br><br>Named a most anticipated book of April by <i>The New York Times </i>| <i>The Washington Post</i> | <i>Oprah Daily</i></b><br><b><br>In his follow-up to the Pulitzer Prize</b><b>?winning <i>The Metaphysical Club</i>, Louis Menand offers a new intellectual and cultural history of the postwar years</b><br><br>The Cold War was not just a contest of power. It was also about ideas, in the broadest sense?economic and political, artistic and personal. In <i>The Free World</i>, the acclaimed Pulitzer Prize?winning scholar and critic Louis Menand tells the story of American culture in the pivotal years from the end of World War II to Vietnam and shows how changing economic, technological, and social forces put their mark on creations of the mind. <br><br>How did elitism and an anti-totalitarian skepticism of passion and ideology give way to a new sensibility defined by freewheeling experimentation and loving the Beatles? How was the ideal of ?freedom? applied to causes that ranged from anti-communism and civil rights to radical acts of self-creation via art and even crime? With the wit and insight familiar to readers of <i>The Metaphysical Club</i> and his <i>New Yorker </i>essays<i>,</i> Menand takes us inside Hannah Arendt?s Manhattan, the Paris of Jean-Paul Sartre and Simone de Beauvoir, Merce Cunningham and John Cage?s residencies at North Carolina?s Black Mountain College, and the Memphis studio where Sam Phillips and Elvis Presley created a new music for the American teenager. 
He examines the post war vogue for French existentialism, structuralism and post-structuralism, the rise of abstract expressionism and pop art, Allen Ginsberg?s friendship with Lionel Trilling, James Baldwin?s transformation into a Civil Right spokesman, Susan Sontag?s challenges to the New York Intellectuals, the defeat of obscenity laws, and the rise of the New Hollywood. <br><br>Stressing the rich flow of ideas across the Atlantic, he also shows how Europeans played a vital role in promoting and influencing American art and entertainment. By the end of the Vietnam era, the American government had lost the moral prestige it enjoyed at the end of the Second World War, but America?s once-despised culture had become respected and adored. With unprecedented verve and range, this book explains how that happened.</p></p>",
danielweck commented 3 years ago

Thanks for the RFC, Laurent. The problem is that the RWPM JSON model is less feature-complete than OPDS1/Atom. Specifically, we are losing the content type and the XML/XHTML namespaces (i.e. we store a simple string of characters).

danielweck commented 3 years ago

Thanks for the link, Pierre. This issue will be fixed in the r2-opds-js lib, where content / summary is parsed and concatenated.

llemeurfr commented 3 years ago

specifically we are losing content-type, and XML/XHTML namespaces (i.e. we store a simple string of characters).

I believe we don't need to keep the XHTML namespace here and can consider that any tagging is HTML-based. The only thing to keep in mind is that if the Atom content is XHTML, it is not escaped in the content field.
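The distinction between the Atom text types from RFC 4287 could be folded into a single HTML string roughly as follows. This is only a sketch: `AtomText` and `atomTextToHtml` are hypothetical names, not r2-opds-js API, and it assumes the XML parser has already entity-decoded `type="html"` content and serialized the inline markup of `type="xhtml"` content into `value`.

```typescript
// Hypothetical model of an Atom text construct after XML parsing.
type AtomTextType = "text" | "html" | "xhtml";

interface AtomText {
  type: AtomTextType;
  // Assumption: for "html" the parser has decoded entities, so this is markup;
  // for "xhtml" this is the serialized inline markup, which was never escaped.
  value: string;
}

// Escape plain text so it can be embedded in an HTML string.
function escapeText(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Produce one HTML string regardless of the original Atom type.
function atomTextToHtml(node: AtomText): string {
  if (node.type === "text") {
    // Plain text carries no markup, so any "<" must be shown literally.
    return escapeText(node.value);
  }
  // "html" and "xhtml" are both treated as HTML-based tagging.
  return node.value;
}
```

The returned string is still untrusted and would need to be sanitized before being written into a DOM element.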

danielweck commented 3 years ago

Fixed via https://github.com/readium/r2-opds-js/commit/e0e87058307157a0c6f13255a3f1e4d2aa2e7a70

Screenshot 2021-05-12 at 19 42 54

danielweck commented 3 years ago

Screenshot 2021-05-12 at 19 46 56

https://catalog.feedbooks.com/item/2516638.atom ==>

  <content type="html">&lt;p&gt;&lt;b&gt;A newly updated edition for the fast-changing real estate market in Canada!&lt;/b&gt;&lt;br&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Over the last two decades Canadians have become convinced that real estate is the
&lt;br /&gt;?safe haven? investment. This widely held belief and obsession with real estate led millions
&lt;br /&gt;of Canadians to take on massive amounts of debt ? tripling their collective financial
&lt;br /&gt;burden ? ensuring that Canada is one of the most indebted nations on the planet.&lt;br&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Drawing on dozens of interviews and even more conversations with individual
&lt;br /&gt;Canadians and couples, this second edition also tackles the economic
&lt;br /&gt;conditions and regulatory rules that allowed such a dangerous situation
&lt;br /&gt;to develop in Canada, formerly a nation of conservative and prudent citizens.&lt;br&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Hilliard MacBeth argues that Canada is in the midst of an unprecedented
&lt;br /&gt;real estate bubble and that there will soon be a crash in house prices,
&lt;br /&gt;triggering a financial crisis. Individual Canadians and families can still take action to
&lt;br /&gt;protect themselves from the fallout of the bubble bursting ? if they act quickly.&lt;/p&gt;</content>
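
In the feed above, the HTML inside <content type="html"> is entity-escaped, which is why it shows up as raw markup when used without decoding. An XML parser normally undoes this when reading the text node; the sketch below only illustrates that step for the five predefined XML entities. The names `XML_ENTITIES` and `decodeXmlText` are hypothetical, and numeric character references are deliberately left out.

```typescript
// The five predefined XML entities and their literal characters.
const XML_ENTITIES: { [name: string]: string } = {
  lt: "<",
  gt: ">",
  amp: "&",
  quot: '"',
  apos: "'",
};

// Decode in a single pass so that "&amp;lt;" becomes "&lt;" and not "<".
function decodeXmlText(raw: string): string {
  return raw.replace(
    /&(lt|gt|amp|quot|apos);/g,
    (_match: string, name: string): string => XML_ENTITIES[name],
  );
}
```

The decoded string is HTML markup and should still go through sanitization before it is rendered.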