GlobalDataverseCommunityConsortium / dataverse-previewers

A collection of Datafile Previewers that can be configured to work with Dataverse
MIT License

HTML previewer use case: Local Cloud Distance Summary #5

Open pdurbin opened 5 years ago

pdurbin commented 5 years ago

Hi! Over at https://github.com/IQSS/dataverse/issues/5746 we just discussed how there's interest in providing some sort of preview for https://faun.rc.fas.harvard.edu/czucker/Paper_Figures/summary_fig.html that looks like the screenshot below.

[Screenshot, 2019-04-10 4:37:39 PM]

Is something like this possible with the HTML previewer? If not, what sort of changes would need to be made? Thanks!

qqmyers commented 5 years ago

Hmm - I think the first issue would be that the current HTML previewer strips out scripts and other potentially nefarious content. That could be selectively turned off, but some mechanism would be needed to make that secure, e.g. a flag on the dataset that only a superuser can set. (Or a mimetype that only an admin can set, etc.)

Beyond that, if the HTML and/or JavaScript libraries are trying to pull content from other files (it sounded like it in #5746, but I don't see where), that code would have to be smart enough to find them in Dataverse. That could just be a matter of paths, but it's hard to tell without really digging in.
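To make the "matter of paths" idea concrete, here is a minimal sketch of resolving a relative reference inside an HTML file to a Dataverse download URL. It assumes the previewer already has a listing of the dataset's files with their folder paths; the `DatasetFile` shape and the function names are hypothetical, and treating root-relative references as relative to the dataset root is also an assumption. The `/api/access/datafile/{id}` path is the Dataverse Data Access API route.

```typescript
// Hypothetical shape: one entry per file in the dataset. Field names are
// assumptions, not taken from the previewer code.
interface DatasetFile {
  id: number;
  path: string; // folder plus file name, e.g. "figs/js/plot.js"
}

// Collapse "." and ".." segments in a slash-separated path.
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out.join("/");
}

// Resolve a reference found in the HTML file against that file's own folder
// in the dataset, then map it to a download URL. Returns null when the
// reference is an external URL or matches no file in the dataset.
function resolveInDataset(
  siteUrl: string,
  htmlFilePath: string,
  reference: string,
  files: DatasetFile[]
): string | null {
  // Anything with a scheme ("https:", "data:") or a leading "//" is external.
  if (/^([a-z][a-z0-9+.-]*:|\/\/)/i.test(reference)) return null;
  const cleaned = reference.split(/[?#]/)[0]; // drop query string and fragment
  const folder = htmlFilePath.split("/").slice(0, -1).join("/");
  // Assumption: a root-relative reference is relative to the dataset root.
  const target =
    cleaned.charAt(0) === "/"
      ? normalizePath(cleaned)
      : normalizePath(`${folder}/${cleaned}`);
  for (const file of files) {
    if (normalizePath(file.path) === target) {
      return `${siteUrl}/api/access/datafile/${file.id}`;
    }
  }
  return null;
}
```

References that return null would be left untouched (external libraries such as a CDN-hosted script) or reported as missing, depending on what the previewer wants to do with them.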

There could be further issues, but the current previewer design, which adds the HTML from the file directly to the DOM rather than putting up an iframe, might help there if there are any security issues with JavaScript in iframes doing things. (If not, an iframe might be a possible alternative if the current previewer raises some as-yet-unidentified issue. For example, the current design won't add any content from the file into the HTML previewer's header, and I see that there's a CSS file and a script currently loaded from the header in the URL above...)
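For comparison, a rough sketch of what the iframe alternative could look like, assuming the whole file (header included) is passed through the standard `srcdoc` attribute and script execution is controlled with the standard `sandbox` attribute. This is not how the current previewer works; `buildPreviewFrame` and `escapeAttribute` are hypothetical names.

```typescript
// Escape text for use inside a double-quoted HTML attribute value.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build iframe markup that carries the whole file, header included, in srcdoc.
// An empty sandbox attribute blocks script execution; "allow-scripts" permits
// it. "allow-same-origin" is deliberately never added, so the framed document
// stays in an opaque origin and cannot read the host page's cookies or DOM.
function buildPreviewFrame(fileHtml: string, allowScripts: boolean): string {
  const sandbox = allowScripts ? "allow-scripts" : "";
  return `<iframe sandbox="${sandbox}" srcdoc="${escapeAttribute(fileHtml)}"></iframe>`;
}
```

With this approach the CSS and script tags in the file's header travel with the document instead of being dropped, and relative URLs inside the frame would still need to be resolved separately.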

Definitely a challenge beyond the simple preview of an ~static html document I was aiming for with v1...

pdurbin commented 4 years ago

> potentially nefarious content

Yeah, I have similar security concerns.

By the way, another user story just came up: https://github.com/IQSS/dataverse/issues/5746#issuecomment-548922673

shlake commented 2 years ago

Here are a few more "html" files that do not render (or render only as the text of the file) with the HTML previewer, but do display after downloading and viewing in Chrome:

https://gdcc.github.io/dataverse-previewers/previewers/v1.3/HtmlPreview.html?fileid=2702&siteUrl=https://dataverse.lib.virginia.edu&datasetid=2699&datasetversion=1.0&locale=en

https://gdcc.github.io/dataverse-previewers/previewers/v1.3/HtmlPreview.html?fileid=26370&siteUrl=https://dataverse.lib.virginia.edu&datasetid=26366&datasetversion=1.0&locale=en

https://gdcc.github.io/dataverse-previewers/previewers/v1.3/HtmlPreview.html?fileid=36402&siteUrl=https://dataverse.lib.virginia.edu&datasetid=36323&datasetversion=1.2&locale=en

qqmyers commented 2 years ago

Just checking the first of those - the page is a single JavaScript script with the data embedded in it. So - definitely the same issue w.r.t. allowing it to run. And a good case because, while it relies on plotly getting loaded, the data itself is in the file.

For security - might it be sufficient to provide a button on the previewer like "Enable Scripts"?
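One way the opt-in could be tracked, as a minimal sketch: scripts stay off by default and are enabled only for the specific file the viewer clicked the button on. The `ScriptConsent` class and `previewMode` function are hypothetical; the file id is assumed to be the `fileid` query parameter the previewer already receives.

```typescript
type PreviewMode = "scripts-stripped" | "scripts-enabled";

// Remembers, for the lifetime of the page only, which files the viewer has
// explicitly enabled scripts for. Nothing is persisted between page loads.
class ScriptConsent {
  private enabledFiles: { [fileId: string]: boolean } = {};

  // Intended to be called from the "Enable Scripts" button's click handler.
  enable(fileId: string): void {
    this.enabledFiles[fileId] = true;
  }

  isEnabled(fileId: string): boolean {
    return this.enabledFiles[fileId] === true;
  }
}

// Decide how a given file should be rendered. The default is always the
// stripped mode, so a missing or unknown id never runs scripts.
function previewMode(consent: ScriptConsent, fileId: string): PreviewMode {
  return consent.isEnabled(fileId) ? "scripts-enabled" : "scripts-stripped";
}
```

Keying the choice by file id means enabling scripts for one preview does not silently carry over to another file.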