sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

Making pdf file public fails if the filename contains a colon #5734

Open williamstein opened 2 years ago

williamstein commented 2 years ago

WORKAROUND: print to html → open html → either use the html as it is or use the browser's "print to PDF" feature

REPORTED BY: N. McNew

haraldschilly commented 2 years ago

ohh, someone just experienced this again. I should check if there is a simple fix, or what's really going on.

haraldschilly commented 2 years ago

well, the problem lies in how chrome is called, or in chrome itself. I tried escaping the colon, but no luck. Even running google-chrome --headless ... directly in the terminal with escaped paths doesn't help. More generally, I think we should prevent all paths and filenames that include any of these: /, ?, <, >, \, :, *, |, and " … since I am sure this is not the only command line tool that could trip over such filenames.
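A minimal sketch of such a check, in Python for illustration only. The names `PROBLEM_CHARS` and `find_problem_components` are hypothetical and not part of CoCalc. It assumes POSIX-style paths, where "/" is the separator and therefore cannot occur inside a single path component.

```python
from pathlib import PurePosixPath

# Characters from the list above. "/" is left out on purpose: it is the
# POSIX path separator, so it never appears inside a single component.
PROBLEM_CHARS = set('?<>\\:*|"')


def find_problem_components(path):
    """Return (component, sorted problematic characters) for each offending part."""
    result = []
    for part in PurePosixPath(path).parts:
        bad = sorted(set(part) & PROBLEM_CHARS)
        if bad:
            result.append((part, bad))
    return result
```

Reporting the offending component, and not just a yes/no answer, makes it possible to tell the user whether the filename or one of its parent directories needs renaming.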

Alternatively, the only real solution here is to work with different filenames, e.g. create a temporary directory on the same filesystem containing a link to the source file → render the pdf into this directory → create a hardlink to the result in the original directory with the problematic character → cleanup: delete the temporary directory.
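The temporary-directory idea could look roughly like the sketch below. This is an illustration under assumptions, not CoCalc's implementation: `render` is a hypothetical callback standing in for the actual html→pdf conversion, and `scratch_root` is a hypothetical directory that the caller must pick so that it sits on the same filesystem as the files (hard links require that) and has no problematic characters in its own path.

```python
import os
import shutil
import tempfile


def render_via_safe_names(source, target, render, scratch_root):
    """Render `source` to `target` while the converter only sees plain names.

    `render(src, dst)` is a hypothetical callback that runs the converter.
    `scratch_root` must be on the same filesystem as `source` and `target`.
    """
    workdir = tempfile.mkdtemp(prefix="render-", dir=scratch_root)
    try:
        # Plain names inside the scratch directory, keeping the extensions.
        safe_source = os.path.join(workdir, "input" + os.path.splitext(source)[1])
        safe_target = os.path.join(workdir, "output" + os.path.splitext(target)[1])
        os.link(source, safe_source)      # hard link, no copy of the data
        render(safe_source, safe_target)
        if os.path.lexists(target):       # os.link refuses to overwrite
            os.remove(target)
        os.link(safe_target, target)      # publish under the original name
    finally:
        shutil.rmtree(workdir)            # cleanup, even if rendering failed
```

One caveat with this sketch: only the source file itself is linked, so an html file that refers to images or stylesheets by relative path would not find them from the scratch directory.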

williamstein commented 2 years ago

I agree with your desire to do this:

More generally, I think we should prevent all paths and filenames that include any of these: /, ?, <, >, \, :, *, |, and " … since I am sure this is not the only command line tool that could trip over such filenames.

Note that we can't actually do that completely: no matter what, users will still sometimes end up with such files. E.g., they could upload a zip file and extract it, or use git clone from GitHub, and get such files that way. Thus we will always have these problems to some extent. That said, it's all a matter of reducing the probability of foot guns, so let's definitely make it so that the UI in CoCalc itself strongly discourages (or outright prevents) making such filenames or directories directly.

Your other idea to use a temporary file seems like a good thing to also do... but like you say, it will be difficult to do in a way that is 100% bulletproof in all cases.

Another easy thing to do that you didn't mention in this case would be to add a message to the "Save and Download as PDF" modal that appears. Basically, if the path contains any weird characters, add a big warning message: "If your file doesn't print properly, it is probably because the path contains one of these characters: .... Please move your file to a path that doesn't contain one of these characters if you need to print."

haraldschilly commented 2 years ago

yes, well ... and just for the record: another tiny way to fix this is to change the directory where the html→pdf conversion happens, i.e. right now the path is relative to the home dir, but the process execution could be told to run right in the correct directory with plain filenames. This would have helped, because in the case where I saw the problem, the directory had the colon, not the filename.
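A rough sketch of that idea, using Python's `subprocess` for illustration. The helper name `run_in_file_directory` is hypothetical, and `command` stands for whatever converter invocation is actually used (its exact arguments are not spelled out in this thread). The point is only that the tool receives the bare filename while the working directory carries the rest of the path.

```python
import os
import subprocess


def run_in_file_directory(command, path):
    """Run `command` with the file's basename appended, from the file's directory.

    The tool never sees the directory part of `path` on its command line,
    so odd characters in parent directories are not passed as arguments.
    """
    directory, name = os.path.split(os.path.abspath(path))
    return subprocess.run(
        [*command, name],
        cwd=directory,          # run right where the file lives
        capture_output=True,
        text=True,
        check=True,             # raise CalledProcessError on non-zero exit
    )
```

As noted above, this only helps when the problematic character is in a directory name. If the filename itself contains the colon, the tool still receives it as an argument.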