eddierubeiz closed 3 years ago
Thanks @eddierubeiz !
@MDiMeo and @lsberry16 I'm going to use my discretion to bump this to 'ready' and work on it this week.
For "honors", my inclination is to make it accept only the same HTML that our "description" fields do -- <b>, <i>, <a> and <cite> only -- other tags will be stripped. (Why? Because that's already the standard code we have, and allowing any HTML can be problematic, as it can mess up the rest of the page or even present security problems.) Any thoughts on this? @eddierubeiz or @lsberry16, do you have a sense of what HTML tags are used in there, if any, and whether any others would be a problem to strip?
@jrochkind, here's a list of 5284 honors.
If you (or I, in the import code) get rid of the enclosing <p> tags, I suspect many of our problems with this field will go away.
Yeah, it's weird that they all have enclosing <p> tags; I don't think there's any reason we need/want to preserve that! (Although I do recall noticing at least one with italics that we might want to preserve, so the standard description HTML treatment works there.)
If I apply the standard description HTML treatment, I think it will automatically strip <p> tags on display, although they might still be in the DB. I can show you the one-liner of our current code to use to strip them on ingest too.
@eddierubeiz to get rid of <p> while leaving the short list of allowed tags, on ingest:
DescriptionSanitizer.new.sanitize("<p>This is <i>an italicized title</i>.</p>")
# => "This is <i>an italicized title</i>."
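For anyone following along without the app's code in front of them, here is a rough, standalone sketch of the allowlist behavior being discussed: keep <b>, <i>, <a> and <cite>, strip every other tag but keep its text. This is not the actual DescriptionSanitizer; the method name `strip_disallowed_tags` and the constant are made up for illustration, and a regex like this is not a safe sanitizer for untrusted HTML (it does not filter attributes, for instance). It only shows the intended effect on the honors data.

```ruby
# Illustrative sketch only: NOT the project's DescriptionSanitizer,
# and NOT a secure HTML sanitizer (attributes are left untouched).
# ALLOWED_TAGS and strip_disallowed_tags are hypothetical names.
ALLOWED_TAGS = %w[b i a cite].freeze

# Remove any opening/closing tag whose name is not in ALLOWED_TAGS,
# leaving the text between the tags in place.
def strip_disallowed_tags(html)
  html.gsub(%r{</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>}) do |tag|
    ALLOWED_TAGS.include?(Regexp.last_match(1).downcase) ? tag : ""
  end
end

strip_disallowed_tags("<p>This is <i>an italicized title</i>.</p>")
# => "This is <i>an italicized title</i>."
```

The real sanitizer should still be what runs on ingest and display; this is just to make the expected before/after concrete.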
I'm sanitizing the description per your instructions.
In the honors section for e.g. Manson Benedict, if you do not select the "Show End Date" checkbox, Drupal will store a copy of the start date in the end date field. I have yet to see any trace in the database of the value of that checkbox, which is frustrating. This may merit a ticket of its own.
Interesting!
So, I've already done the code to display this correctly if the start and end are identical.
But, in the interests of keeping our data clean -- I am thinking your ingest should probably leave the end blank if it is identical to the start? That's how it appears to the editor in the Drupal microsite; it's sort of a quirk of the implementation that it fills in the 'end' anyway, but if we import it that way, it will show up in our app's editing screens differently than it does in the Drupal editing screens.
What do you think?
That's totally fine; the initial code I wrote did just that, and it's trivial to change it back. If I do come to understand how that checkbox is stored in the database, we can revisit this.
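A minimal sketch of the ingest rule agreed on above, assuming the importer has the start and end values in hand as comparable objects (strings or dates). The method name `normalize_honor_dates` and the hash keys are hypothetical, not taken from the actual import code, and the sample values are placeholders.

```ruby
# Hypothetical helper: the name, signature, and hash keys are assumptions
# for illustration, not the real importer's API.
#
# Drupal copies the start date into the end date when "Show End Date"
# is unchecked, so an end equal to the start is treated as "no end date".
def normalize_honor_dates(start_date, end_date)
  end_date = nil if end_date == start_date
  { start_date: start_date, end_date: end_date }
end

# Placeholder values, not real honors data:
normalize_honor_dates("1975", "1975")
# => {:start_date=>"1975", :end_date=>nil}
normalize_honor_dates("1975", "1980")
# => {:start_date=>"1975", :end_date=>"1980"}
```

One known trade-off: an honor that really does start and end on the same date would also lose its end date, which is why this may be worth revisiting if the checkbox value turns up in the database.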
The display of interviewee bio data on the public page could use a couple adjustments in light of the imported data. We may want to modify the OH tab in the admin interface, too.
I think that's it!
For an example, compare: https://oh.sciencehistory.org/oral-histories/benedict-manson https://staging-digital.sciencehistory.org/works/kw52j9080