gdcc / dataverse-previewers

A collection of Datafile Previewers that can be configured to work with Dataverse
MIT License
5 stars 15 forks source link

HTML previewer not rendering images #31

Closed kmika11 closed 7 months ago

kmika11 commented 1 year ago

Images and figures fail to render when previewing html files.

A depositor is hoping to publish an html file of a 3d graph that will allow users to engage with and manipulate the image within the previewer. However, when deposited, the file is recognized as an HTML file and triggers the "Explore on View Html" button

Screenshot 2023-05-02 at 12 28 27 PM

When clicked, the "Preview Html" window simply displays the filename and dataset

Screenshot 2023-05-02 at 12 28 37 PM

This also seems to be happening with other html files, but for images only. In this example from Harvard Dataverse: https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/3SLMJY/LTUGXH, the text and code blocks are successfully previewed, but the image at the bottom does not render.

Screenshot 2023-05-02 at 12 29 37 PM
phillipsjs commented 1 year ago

Just wanted to follow up on this and see if any progress has been made - the journal article will be published soon and we would love it if the 3d image could be viewed on dataverse. Thanks!!

pdurbin commented 1 year ago

@phillipsjs unfortunately, while @JR-1991 @qqmyers @kmika11 and I had a robust internal discussion on this, it comes down to security.

One idea from @qqmyers is to NOT show the plot immediately but offer a button so that people can opt into running the Javascript.

@kmika11 suggested whitelisting Plotly.

In short, while I'm sure your intentions are true, Harvard Dataverse is very open. We often take down spam and other junk. We need to be careful. Other Dataverse installations may be more tightly controlled.

Is your file the ft_space_3d.html file we were playing with? That one used Plotly. @JR-1991 had a nice one called CarsAndPTWsWorldwide.html. I'd post screenshots but I'm sure if these are unpublished or not.

kmika11 commented 1 year ago

@pdurbin & @qqmyers - is the button option on the user end? So users that want to preview the plot could click a button to run the Javascript after viewing a warning? I think that makes sense to manage some of the security risks but not restricting the preview to only plotly and other hard coded/whitelisted libraries

qqmyers commented 1 year ago

Yes.

Now that we have signedURLs for tools, I'd definitely recommend using for this previewer, along with the warning. Using a signedURL would mean the Javascript, if it were malicious, would never have access to the user's api token, so it wouldn't be able to use that to make other calls as that user in an attempt to damage Dataverse/ that user's data in the Dataverse, etc. and the risk would really be to the user - a malicious Javascript could do anything bad that a malicious Javascript on any page could do to them.

That would mean the user warning would just have to say something like "This data file includes JavaScript which may need to run for the data to display properly. You can click the button to allow the JavaScript to run, but be sure you trust this datafile as a malicious JavaScript could harm your computer (with the same concerns as if you went to a malicious website outside of Dataverse)." (With signedURLs, we wouldn't need to also say that a malicious script could delete any data you may have in this Dataverse instance, create spam datasets using your account, etc.)

phillipsjs commented 1 year ago

Hi, just returning to this to say that from our perspective, having a warning and a clicking a button to allow for the javascript to execute would be perfectly fine! Sounds like that might be a good solution all-around!

JR-1991 commented 1 year ago

Adding a confirm and subsequent if-statement in html.js plus not stripping the script tag should do the work.

Is there something to be taken care of in xss.js?

What do you think? Happy to submit a PR on Monday 🙂

pdurbin commented 1 year ago

I'd be happy to see a PR. 😄

qqmyers commented 1 year ago

Yeah - hopefully straight forward so it would be great to have something people can try. xss,js is an external lib so no change there. I'm not sure if we should replace the existing HTML previewer or make a new InteractiveHTML previewer (or something) and maybe keep it out of the recommended/non-experimental list for now. (Just don't want an admin to cut/paste our list and install something that could cause issues without understanding the risks of allowing javascript.)

pdurbin commented 1 year ago

Yeah, I was sort of thinking a second POTENTIALLY DANGEROUS HTML previewer might be nice, so people installing them have options. 😄

phillipsjs commented 1 year ago

Really excited to hear that this will likely work! Thanks all :)

JR-1991 commented 1 year ago

I got it to work using the Plotly example from our Dataverse installation. @qqmyers I needed to remove the filterXSS for it to work. Is this due to the particular case of Plotly? Is it too dangerous to leave that one out? Guess most plotting frameworks rely on special CSS.

Checking with some other HTML files and will submit a PR today.

InsecureHTMLPreview
qqmyers commented 1 year ago

XSS is just removing potentially dangerous tags (like Githubissues.

  • Githubissues is a development platform for aggregating issues.