xylo04 closed this issue 9 years ago
Nope, not really. Is this consistent in all browsers?
I was working primarily in Chrome 42; I just tried Firefox 37, with similar (but not identical) errors to the first error. Firefox actually gave me this warning about my index.html, though:
The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol.
So I'll try setting UTF-8 encoding either through headers or pre-processor tags and see if that will clear up the first error.
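One way to do the header half of that, assuming you control the local Node server: append an explicit UTF-8 charset to text-like Content-Type values so the browser doesn't have to guess. This is only a sketch; `contentTypeWithCharset` is a hypothetical helper, not part of texlive.js or http-server.

```javascript
// Sketch: add `; charset=utf-8` to text-like Content-Type values that
// don't already declare a charset. Binary types are left untouched.
function contentTypeWithCharset(contentType) {
  var isTextLike = /^text\/|\/(javascript|json|xml)\b/.test(contentType);
  if (isTextLike && !/charset=/i.test(contentType)) {
    return contentType + '; charset=utf-8';
  }
  return contentType;
}

// In a Node http handler, roughly:
// res.setHeader('Content-Type', contentTypeWithCharset('text/html'));
```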
The second `Could not undump` error seems to happen often on the second compile. (I'm instantiating a new PDFTeX object each time, as recommended elsewhere, so that shouldn't be the issue.)
I have a reliable reproduction case for at least the `Could not undump` error, but it's in a prototype I'm not ready to publish yet. Can you email me, and I'll provide details? I just made my email address available on my profile.
You guys at Google...
I'd prefer an open discussion. Are these details really so confidential?
Not particularly, but I have to have approval before releasing open source (technically, even patches/pull requests), yada yada... I'll be very happy once I am able to open source my project so I don't have to keep tiptoeing around red tape!
You can find the prototype here; it will open a tex file in Google Drive that I've shared publicly. You'll have to authorize my app to use a Google account of yours, but only files you explicitly authorize will be readable. After that, the file's content should show up in the left pane; clicking the red button should begin compiling, and the generated PDF should be previewed in the right pane.
Clicking the red button a second time causes the `could not undump` error, as does reloading the page and clicking the red button for the first time in the new session. I'm able to reproduce this in Chrome and Firefox. Clearing the browser's cache makes compiling work again. This behavior is different from what I experience using the Node http-server locally: in that environment, I'm able to compile time after time to my heart's content.
Note this is still a prototype and has some rough edges, not to mention I'm a backend engineer by trade and it's been a while since I've done much JavaScript!
EDIT: I've moved the prototype here; if you see an SSL warning page, you simply need to type 'danger' in the window to bypass.
Did you compile the LaTeX compiler to JS yourself, or did you use my version?
I haven't tried compiling it myself yet. I used the GitHub copies.
This error is really weird. I'd say it's probably that your server is serving corrupted files (or files that are interpreted incorrectly). I'd check MIME types and maybe try to use `PDFTex.FS_readFile()` to check that the contents of e.g. `latex.fmt` aren't corrupted.
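A minimal sketch of such a check, assuming a `pdftex` object exposing `FS_readFile()` as in texlive.js; `describeFile` and the expected-size argument are illustrative, not part of the library.

```javascript
// Sketch: report whether a file in the Emscripten file system looks intact.
// `describeFile` is a hypothetical helper for debugging output only.
function describeFile(name, contents, expectedBytes) {
  if (contents == null) {
    return name + ': missing from the Emscripten file system';
  }
  if (expectedBytes !== undefined && contents.length !== expectedBytes) {
    return name + ': ' + contents.length + ' bytes, expected ' +
      expectedBytes + ' (truncated or corrupted?)';
  }
  return name + ': ' + contents.length + ' bytes';
}

// Against a real PDFTeX instance, something like:
// console.log(describeFile('latex.fmt', pdftex.FS_readFile('latex.fmt')));
```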
No luck yet. I've noted that App Engine is not sending a `Content-Length` header, which according to some Googling is OK because it's supposed to use `Transfer-Encoding: chunked` instead, only I don't see it doing that either. So perhaps it's an App Engine issue. I feel like I saw the `undump` error on a different server before, but I don't remember which one, maybe Apache.
Another difference that could possibly explain it is gzip. I would think whatever's doing the XHR to load e.g. latex.fmt or article.cls would be smart enough to ungzip, or that the browser would do it before it's handed back, but if that wasn't the case it would explain why the files look corrupted.
Yeah, it should ungzip it automatically. But have you tried enabling gzip on your Node.js server just to be sure?
Well, I finally started using `FS_readFile` properly and found that `latex.fmt` (which is not transferred with gzip) is being loaded into the Emscripten file system correctly, but `minimal.cls` is not correct: it's supposed to be 2028 bytes long, but when served by App Engine it's being truncated to 1032 bytes in the file system, which is suspiciously close to the 1039 bytes that were transferred over the wire while it was gzipped. The content of the file is plaintext (not gzipped) but truncated. I'm still trying to see what I can do to enable gzip locally, or disable gzip on App Engine, to verify that's the root cause.
Whoops, sounds like a rather serious bug in App Engine. Maybe this helps debugging: https://cloud.google.com/appengine/kb/general
We use a combination of request headers (Accept-Encoding, User-Agent) and response headers (Content-Type) to determine whether or not the end-user can take advantage of gzipped content. This approach avoids some well-known bugs with gzipped content in popular browsers. To force gzipped content to be served, clients may supply 'gzip' as the value of both the Accept-Encoding and User-Agent request headers. Content will never be gzipped if no Accept-Encoding header is present.
Alright, I can report some progress!
I was able to find a way to get App Engine to serve all of the TeX Live resources, `.cls` files in particular, with `application/octet-stream`, which appears to disable gzip. Now that I've done that, I'm able to compile reliably the first time I load the page.
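For anyone trying to reproduce this, the relevant knob in App Engine's `app.yaml` static handlers is `mime_type`; a sketch, with hypothetical URL and directory names (the `mime_type` override is the part that appeared to suppress gzip):

```yaml
handlers:
# Hypothetical url/static_dir; mime_type forces the served Content-Type.
- url: /texlive
  static_dir: texlive
  mime_type: application/octet-stream
```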
The second time I compile, I get back to the `could not undump` error, and with the `FS_readFile` logging in place, I can see that the file doesn't exist in the Emscripten file system. Looking back at the Network tab, App Engine is deciding to return a `304 Not Modified` instead of re-serving the file. My next step is to see if I can disable that behavior and always serve the file content.
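Two standard ways to keep `304`s out of the picture while debugging, sketched below; `bustCache` is a hypothetical helper, not part of texlive.js.

```javascript
// Appending a unique query string makes every request a cache miss, forcing
// the server to answer 200 with a full body instead of 304 Not Modified.
function bustCache(url) {
  var sep = url.indexOf('?') === -1 ? '?' : '&';
  return url + sep + 'nocache=' + Date.now();
}

// The server-side equivalent is to send `Cache-Control: no-store`, so the
// browser never sends If-Modified-Since / If-None-Match in the first place.
```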
I suspect Emscripten is doing something non-standard to fetch files from the network and populate them in the virtual file system. If that were the case, it would explain why gzipped responses are truncated, and why cached responses end up as empty files in the file system.
EDIT: Just saw your response, and yep, it could be an App Engine bug as well. At the very least, they seem to have a non-standard default configuration and I'm having one heck of a time configuring around it.
I posed the question to emscripten-discuss.
Huh, I'm no longer seeing issues with `304 Not Modified` responses. It looks like App Engine is no longer serving them, and is always serving 200s with content? I'm not sure what changed, but it's working at the moment.
I'm still not sure how I configured around this, but until further notice I think it's safe to assume that the following holds true when serving texlive.js and probably all Emscripten libraries:
Really strange. Did you try to set up gzip and/or caching on your Node.js server?
Unfortunately I haven't taken the time to confirm it properly that way. I should, and I'll see if I can make some time to do so, but that's my current feeling.
I'm having luck running my app and compiling TeX on localhost using the Node.js http-server, but in other server configurations (Python SimpleHTTPServer, Google App Engine), I tend to get errors like:
Or sometimes:
This second one is seemingly fixed if I clear my browser's cache. Based on that, I feel like maybe it's a header issue, maybe encoding, but I'm not sure.
Are you aware of any server configuration tricks that need to be addressed for texlive.js to work properly?