gu-gridh / litteraturlabbet-frontend

The frontend view of the Litteraturlabbet application at GRIDH

Passim: check end of page #52

Closed siskahumlesjo closed 5 months ago

siskahumlesjo commented 10 months ago

Check the end of each page to see whether a reuse continues on the next page (or vice versa), so that the results are clean and we have access to the whole reused text.

jonathanwestin commented 10 months ago

The main idea is this: some passages at the end of one page and the start of the next really belong to the same reuse, but the algorithm has divided them into two separate reuses. If we can stitch those passages together, we can clean away a lot of reuses.
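A minimal sketch of that stitching idea, under stated assumptions: the `Passage` fields, the `page_lengths` mapping, and the `tolerance` value are all hypothetical and do not reflect Passim's actual output format. It groups passages from the same book when one ends near the bottom of a page and the next starts near the top of the following page. A fuller check would probably also confirm that the matching passages in the other work are adjacent.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    book: str   # hypothetical book identifier
    page: int   # page number within the book
    start: int  # character offset where the passage starts on its page
    end: int    # character offset just past the end of the passage
    text: str


def stitch_across_pages(passages, page_lengths, tolerance=20):
    """Group passages that appear to continue across a page break.

    page_lengths maps (book, page) to the page's length in characters.
    tolerance is how many characters from the page edge still count
    as "at the end" or "at the start" of a page (an assumed value).
    """
    ordered = sorted(passages, key=lambda p: (p.book, p.page, p.start))
    groups = []
    for p in ordered:
        if groups:
            last = groups[-1][-1]
            at_page_end = page_lengths[(last.book, last.page)] - last.end <= tolerance
            at_page_start = p.start <= tolerance
            if (last.book == p.book and p.page == last.page + 1
                    and at_page_end and at_page_start):
                groups[-1].append(p)  # continuation of the previous passage
                continue
        groups.append([p])  # start a new group
    return groups
```

Each returned group with more than one passage is a candidate for merging into a single reuse, for example by joining the `text` fields in order.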

daalft commented 10 months ago

Should we maybe run Passim on whole books? Right now we run it on separate pages.

jonathanwestin commented 10 months ago

If possible!

daalft commented 10 months ago

I wonder if we still have access to the page numbers easily then... I will investigate

jonathanwestin commented 10 months ago

Probably not easily, but I presume we would have to stitch all the pages together anyway, and then we could perhaps add some tags that indicate the original page...
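One way the page tagging could work, sketched under assumptions: instead of inline tags, keep a side table of the character offset where each page starts in the joined book text, and look pages up from that. The function names and the `(page_number, text)` input shape are hypothetical, and offsets are assumed to be non-negative positions in the joined text.

```python
from bisect import bisect_right


def join_pages(pages, sep="\n"):
    """Join (page_number, text) pairs into one book text.

    Returns the joined text, the start offset of each page in that
    text, and the page numbers in the same order.
    """
    parts, starts, numbers = [], [], []
    offset = 0
    for number, text in pages:
        starts.append(offset)
        numbers.append(number)
        parts.append(text)
        offset += len(text) + len(sep)
    return sep.join(parts), starts, numbers


def pages_for_span(begin, end, starts, numbers):
    """Return the original page numbers covered by text[begin:end]."""
    first = bisect_right(starts, begin) - 1
    last = bisect_right(starts, max(begin, end - 1)) - 1
    return numbers[first:last + 1]
```

With this, a reuse found in the whole-book text at offsets `begin` to `end` can be traced back to its page or pages via `pages_for_span(begin, end, starts, numbers)`. A span that returns two page numbers is one that crosses a page break.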