sfu-dhil / wilde

eXist/XQuery app for detecting copying in a collection of XHTML documents.
GNU General Public License v3.0

Formatting problem #118

Closed ccolliga closed 3 years ago

ccolliga commented 3 years ago

Describe the bug: The text of the news article is not correctly aligned.

To Reproduce: Open https://dhil.lib.sfu.ca/wilde/view.html?f=ea_1665 and see the misaligned text.

Expected behavior: The body text should be aligned.

Screenshots

[Screenshot, 2021-04-02 at 10 05 59]

Note that I am seeing similar problems on some other pages.

joeytakeda commented 3 years ago

This seems to be an issue with the source encoding and not the styling—that paragraph isn't wrapped in a <p>:

   <p class="heading" id="ea_1665_0">Dépêches particilières <br />de l'EXPRESS</p> 
   <p id="ea_1665_1">[...]</p> Londres, 6 avril. M. Oscar Wilde a été arrêté dans la soirée d'hier. 
   <p id="ea_1665_2">[...]</p> 

Since it's not in a paragraph tag, it's not being rendered as a paragraph by the application (and I would imagine that means those untagged parts aren't being captured by the text matching program either, but @ubermichael would know for sure).

It seems that there are a handful of documents that have this issue; I'll post that list with the reports.
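For illustration only, here is a rough sketch of how text that sits outside any <p> could be detected. It is not the check used in this project: the function name `untagged_text` is hypothetical, the fragment is wrapped in a placeholder <div> so that it parses, and it assumes the markup is well-formed XML with no named entities such as &nbsp;.

```python
import xml.etree.ElementTree as ET


def untagged_text(xhtml_fragment: str) -> list[str]:
    """Return text found directly inside the container, outside any child element.

    Hypothetical helper: the fragment is wrapped in a placeholder <div> so it
    has a single root, and is assumed to be well-formed XML.
    """
    root = ET.fromstring(f"<div>{xhtml_fragment}</div>")
    stray = []
    # Text before the first child element.
    if root.text and root.text.strip():
        stray.append(root.text.strip())
    # Text following each direct child (its "tail") is not wrapped in any tag.
    for child in root:
        if child.tail and child.tail.strip():
            stray.append(child.tail.strip())
    return stray


sample = """
<p class="heading" id="ea_1665_0">Dépêches particilières <br />de l'EXPRESS</p>
<p id="ea_1665_1">[...]</p> Londres, 6 avril. M. Oscar Wilde a été arrêté dans la soirée d'hier.
<p id="ea_1665_2">[...]</p>
"""

print(untagged_text(sample))
```

Run on the snippet above, this reports only the "Londres, 6 avril..." sentence, since the text after the <br /> is still inside the heading paragraph.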

joeytakeda commented 3 years ago

Closing this and the related issues since this is a data issue; I've opened a ticket in the reports repo.