Open kouk opened 9 years ago
There's also a discussion here which might be related: https://github.com/substack/html-tokenize/issues/13
But all this isn't really future proof, it would be great if somehow we can call ts.end()
or node.abort()
and have it kill all the streams that might be blocking.
I'm running in the same issue using this gist. If you uncomment line 45 and comment 44 out, it won't stop loading and you can't porceed with anything. Couldn't find any solution to this point and can't really think of any way to stop it from loading...
From the notes here I agree this feels like an upstream fix. But thanks for the steps to reproduce.
Sorry for seeming impatient (I'm not, just curious) but do you already know what you gonna do about it? Reporting this as a bug to the actual module that is affected or fix it by yourself? Since you added the 'help-wanted' tag, do you wait for somebody you know to help you out with this?
Hi @mspl13 I am not actively looking at this at the moment. Hence the help wanted tag. I would love somebody to help with this if they have the bandwidth.
Hi @No9 I faced issue with bad HTML as well ( "end" event does not get fired for bad HTML) , I ended up adding this code to fix it: Would you let me create a PR with a change similar to the following? (need to get the Max timeout from configuration). Thanks.
let endEmittedInterval;
let endEmittedInterval;
tr.on('finish', () => {
let intervalCount = 0;
endEmittedInterval = setInterval(() => {
try {
intervalCount += 1;
// make sure read and write are completed after finish
if (tr._writableState.ended === true && tr._tokenize._readableState.endEmitted === true) {
clearInterval(endEmittedInterval);
_end.call(res);
} else if (intervalCount >= 6000) {
// clear the runway interval after 60 seconds
// TODO: change 60 seconds to server's MAXIMUM_TIMEOUT value
clearInterval(endEmittedInterval);
_end.call(res);
}
} catch (err) {
clearInterval(endEmittedInterval);
_end.call(res);
}
}, 10);
});
Hi @chandransuresh Thanks for your message.
Do you know if this patch to html-select fix your issue? https://github.com/substack/html-select/pull/23
If it does we might want to fork that along with trumpet and publishing them in there own namespace?
If not then we can consider your workaround above and putting it into the harmon constructor something like
var opts = {};
opts.timeout = 6000;
app.use(require('../')([], selects, opts));
Let me know how you get on
Hi @No9 , Thanks for the response. I tested with the change in kouk:master branch from substack/html-select#23. No, it does not fix the issue of broken HTML. In my sample page, there are bunch of tags that does not have any matching close tags.
Hi @No9 , Sorry for the delay in getting back. Could you give me access to create a PR for malformed HTML fix?. The changed file is in the following clone. https://github.com/chandransuresh/harmon-malformed-html-fix/blob/fix-for-malformed-html/index.js
Thanks, Suresh
Depending on the source markup there are times when the trumpet pipeline will not finish processing even though the response has been written entirely (see below). It would be great if there was a way to set a timeout and cancel the pipeline somehow. For example:
If you run the example and
curl http://localhost:8001
it works fine. But if you uncomment the commented line and try again it hangs. I've tracked down the specific underlying reason for this in the html-select module, and I'll raise this specific issue with them. But what I'm interested in is shielding myself from these kinds of bugs (which may be triggered by broken proxied web pages) by doing setTimeout and aborting the trumpet pipeline somehow. I've tried various combinations ofres.end()
andtr.end()
but the stream fromtr.createStream
just never seems to die. I would appreciate if you could show me a way, perhaps even a totally different approach.