DDMAL / CantusDB

A new site for Cantus Database running under Django.
https://cantusdatabase.org
MIT License

Source Detail: we should stop preserving linebreaks in Description field #1239

Open jacobdgm opened 8 months ago

jacobdgm commented 8 months ago

with #1087, we made a change in the Source Detail template to preserve line breaks in the description field, as well as several other fields.

Some source descriptions include formatting via HTML tags, so this was causing more spacing than necessary. We asked Debra, and she said we should go with styling via HTML tags and not via line breaks.

When a fix for this issue goes to production, we need to let Debra know, so someone can go through and make sure all necessary <p>s and <br>s are included in source descriptions.

annamorphism commented 8 months ago

Summarizing discussion from this morning's meeting: It would be good to figure out why the line breaks are not always retained; possibly this has something to do with copy-pasting from Word (or other word-processor hijinks). A solution where html tags have to be put into all future entries, and retroactively into a bunch of existing ones, is workable but not ideal.

jacobdgm commented 8 months ago

I haven't been able to figure out how to download Word, but I just tried a couple of experiments in Google Docs and Apple Pages to test whether pasting into textareas introduces html tags:

In both OldCantus and New, copying and pasting from these applications does not lead to the introduction of html tags. Where bullet points exist in the copied text, they are converted into single lines without bullet points.

Unless Word behaves differently than Google Docs and Apple Pages, or unless different operating systems or browsers handle pasting differently, it seems likely that indexers have manually added these html tags to the descriptions of certain sources over the years.

(for what it's worth, I've used a bunch of different browsers and different operating systems over the years, and I've never seen html tags added when I pasted into a text field. I highly doubt this could account for the html tags we're observing in source descriptions)

annamorphism commented 8 months ago

I don't think the html tags get added with the paste, but something is a little weird about copy pasting. e.g. these two look fairly similar in the editor:

image

but the one copied from Word has a second line break in it (after "the project") when rendered:

image

annamorphism commented 8 months ago

actually, rephrase: I see some related problems about getting html tags inserted when pasting from Word or Outlook from some time back (around 2011). So maybe some sources got tags inserted (automatically) at the last big Cantus update, and since then it's been tag-less?

I don't think it's very likely that somebody writing out their own html would have decided to use a special font for just a few random characters in this description https://cantus.uwaterloo.ca/source/123648, whereas it seems all sorts of likely that they are the result of some sort of weird word-to-html conversion.

jacobdgm commented 8 months ago

> I don't think it's very likely that somebody writing out their own html would have decided to use a special font for just a few random characters in this description https://cantus.uwaterloo.ca/source/123648, whereas it seems all sorts of likely that they are the result of some sort of weird word-to-html conversion.

Indeed, this does not seem to be the work of some musicologist. Maybe this occurred when some data was automatically added to OldCantus (not exactly sure what "the last big Cantus update" means, but since the copyright dates at the bottom of the pages on OldCantus start with 2012, this could align with your "around 2011").

So, what do we want to do going forward? We already have a WYSIWYG editor set up for articles that we could potentially put onto Source pages, so that musicologists can use bullet points without having to fiddle with HTML tags. Though this doesn't actually solve the problem of someone needing to go through and manually check the formatting of all the source descriptions.

annamorphism commented 8 months ago

Some source descriptions on OldCantus include tables https://cantus.uwaterloo.ca/source/123610 I've also seen block quotes and maybe a few other interesting formatting cases.

Some places on OldCantus (e.g. the home page) have an "Enable rich text" option. It doesn't seem to be present for source descriptions, but maybe it was at one time?

jacobdgm commented 8 months ago

Some summary of the discussion @annamorphism and I just had:

One possible solution to this situation is to add a wysiwyg editor (like we currently use on Article pages) to Source pages. We need to ask Debra about this, though - this feature may initially have been present but was later removed.

jacobdgm commented 8 months ago

another thing I discovered while poking around just now: clicking on "More information about text formats" on the OldCantus edit-the-mainpage editor brings you to https://cantus.uwaterloo.ca/filter/tips. It shows me different information depending on whether or not I'm logged in.

annamorphism commented 8 months ago

> another thing I discovered while poking around just now: clicking on "More information about text formats" on the OldCantus edit-the-mainpage editor brings you to https://cantus.uwaterloo.ca/filter/tips. It shows me different information depending on whether or not I'm logged in.

Interesting, because non-logged-in users shouldn't be composing anything anyway! The "not logged in" information is just the plain-text paragraph of the logged-in information, so presumably at some point it was changed to recommend only plain text and not bother informing people about WYSIWYG.

jacobdgm commented 6 months ago

noting that this issue is similar to #1219, and that we can apply a similar approach (checking for html tags before applying the linebreaks decorator) here as we do there.
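A rough sketch of that approach in plain Python, in case it helps the discussion. This is not the code from #1219 or from the Source Detail template: `contains_html_tags` and `render_description` are hypothetical names, the regex is only a heuristic for "looks like a tag", and the paragraph/`<br>` conversion is a simplified stand-in for what a linebreaks-style filter does.

```python
import html
import re

# Hypothetical heuristic: anything shaped like an opening or closing tag,
# e.g. <p>, </p>, <br>, <font face="...">. Not a real HTML parser.
HTML_TAG_RE = re.compile(r"</?[a-zA-Z][^<>]*>")


def contains_html_tags(text: str) -> bool:
    """Return True if the text appears to contain at least one HTML tag."""
    return bool(HTML_TAG_RE.search(text))


def render_description(text: str) -> str:
    """Preserve line breaks only when the description has no HTML tags.

    Descriptions that already contain tags are returned untouched, so their
    own <p>/<br> markup controls the spacing. Tag-less descriptions are
    escaped, then blank lines become paragraph boundaries and single
    newlines become <br>.
    """
    if contains_html_tags(text):
        return text
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n{2,}", normalized.strip())
    return "\n\n".join(
        "<p>" + html.escape(p).replace("\n", "<br>") + "</p>"
        for p in paragraphs
        if p
    )
```

With this, a description that already uses `<p>` tags is passed through as-is, while a plain-text one keeps its line breaks.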

dchiller commented 1 month ago

After further discussion, we will do what the title of this issue says, and just not preserve linebreaks when the Description is displayed. We will move towards all Source Description formatting being done with html tags.

As the first step of this, let's see if we can provide Debra with a list of sources that have no html tags in their description as ones to take a look at before/when this goes live.
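Something like the following could produce that list. It is only a sketch under assumptions: the `Source` tuple stands in for whatever the real model looks like, the IDs and descriptions in the usage note are made up, the tag check is a regex heuristic, and skipping empty descriptions is my guess at what would be useful.

```python
import re
from typing import Iterable, List, NamedTuple


class Source(NamedTuple):
    """Hypothetical stand-in for a source record (not the real model)."""
    id: int
    description: str


# Heuristic for "contains an HTML tag"; an assumption, not a full parser.
HTML_TAG_RE = re.compile(r"</?[a-zA-Z][^<>]*>")


def sources_without_html(sources: Iterable[Source]) -> List[int]:
    """Return IDs of sources whose non-empty description has no HTML tags.

    Empty or whitespace-only descriptions are skipped, on the assumption
    that there is nothing in them to reformat.
    """
    return [
        s.id
        for s in sources
        if s.description.strip() and not HTML_TAG_RE.search(s.description)
    ]
```

For example, given `Source(1, "Plain text\nwith a line break")`, `Source(2, "<p>Already tagged</p>")` and `Source(3, "")`, only ID 1 would end up on the list.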

dchiller commented 1 month ago

Also, moving this to High Priority so that Debra et al. can begin harmonizing the actual content of the Description field with this standard approach.

ahankinson commented 1 month ago

Can I make a potentially left-field suggestion?

I find it frustrating with Django apps when I just want to write some prose with no special formatting, and the $%@# text input fields then mush all my text together when it gets saved.

On the other hand, I also know it creates problems when some descriptions allow line breaks, and this gets mixed up with HTML formatting.

On DIAMM, we use Markdown for fields where we might want to use formatting, but we might also just want to write a plain paragraph or two. The advantage is that this gets converted to HTML when being displayed, including the line breaks, so it's a bit of the best of both worlds.

A bonus is that there is a minimal text editor that comes along with it. See the screenshot.

image

If this sounds interesting, I can supply more details.

dchiller commented 1 month ago

I would love it soooo much if we used markdown. Please, detail away!!!!

annamorphism commented 1 month ago

I too am intrigued by this suggestion!