nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.42k stars 29.52k forks source link

http2 client request doesn't always fire end event #31890

Open NeoPhi opened 4 years ago

NeoPhi commented 4 years ago

What steps will reproduce the bug?

const http2 = require('http2');
const client = http2.connect('https://prod.geo-fsty.philo.com');
const req = client.request({ ':path': '/' });
let data = '';
req.on('data', chunk => { data += chunk; });
req.on('end', () => {
  console.log(`END: ${data}`);
  client.close();
});
req.end();

How often does it reproduce? Is there a required condition?

This is reproducible every time. Other HTTP/2 clients behave correctly, such as: curl --http2 -v https://prod.geo-fsty.philo.com

What is the expected behavior?

The end event is fired.

What do you see instead?

The end event does not fire. The complete data payload is received.

Additional information

If the code is modified to wait for the ClientHttp2Session to be ready and delay issuing the request via setImmediate, it works.

const http2 = require('http2');
const client = http2.connect('https://prod.geo-fsty.philo.com', () => {
  setImmediate(() => {
    const req = client.request({ ':path': '/' });
    let data = '';
    req.on('data', chunk => { data += chunk; });
    req.on('end', () => {
      console.log(`END: ${data}`);
      client.close();
    });
    req.end();
  });
});
mnini commented 4 years ago

Well, that thing is utter mess...

You are binding, with callback, but you are using binded variable already in callback?...

Also, it's expected...

const http2 = require('http2')
const download = function(root, sublinks) {
    return {
        root: root,
        URLs: sublinks,
        recieved: {},
        add: function(URL,content) {
            this.recived[URL].add(content)
        },
        *[Symbol.iterator] () {
            var position = 0
            yield (callback)=>{
                var URL = this.URLs[position]
                req = this.client.request(':path', URL)
                var fileType
                req.setEncoding('uft8')
                req.on('response', (headers, flags)=>{
                    console.log(headers)
                    if(!headers.youWantThisFile) {
                        return req.end()
                    }
                    this.setEncoding(headers['Encoding'])
                    this.recieved[URL] = {
                        content = '',
                        add: function(x){
                            this.content += x
                        },
                        type: headers['Content-Type']
                    }
                })         
                req.on('data', (chunk)=>{
                    this.recieved[URL].add(chunk)
                })
                req.on('end', ()=>{
                    callback(this.URLs[position], this.recived[this.URLs[position]].lenght>?)                    
                })
            }        
        },
        handle: function(callback){
            this.client = http2.connect(this.root)
            for(link in this) {
                link(callback)
            }
        }
    }
}

download('random.org', [
    '/favico.png',
    '/robots.txt',
    '/sitemap.xml'

]).handle( (sublink, content)=>{

    console.log('Downloaded['+sublink+']:' + content.lenght)

})

It should work like this, but I recommend different approach, passing "request" to the callback.

You are actually firing .end() before request has option to fire it in that case, because what happens at event isn't sequential to when you bind it, and your code is executing "req.end()" before any event is emitted, therefore you close your client, before "on data" happens.

NeoPhi commented 4 years ago

@mnini My first code example, which does not work, is based directly on the HTTP2 Client-side Example in the documentation: https://nodejs.org/docs/latest-v12.x/api/http2.html#http2_client_side_example

The fact that introducing arbitrary delays into the request processing resolves the bug leads me to think that there is some subtle race condition in the internal implementation.

My use of .end() is unrelated to the end event which should be firing in the first example and does if you replace the authority with another HTTP/2 example such as www.google.com.

GrosSacASac commented 4 years ago

Should the doc or the code be fixed ? Or both ?

mnini commented 4 years ago
/// Edited: I know it doesn't matter, but here, return Object.assign(new Promise(), {}) should be used instead, I know it's just example but I have a strong OCD.
const download = function(root, sublinks) {
    return {
        mainURL: 'https://'+root',
        URLs: sublinks,
        recieved: {},
        *[Symbol.iterator] () {
            while(this.URLs.length>0){                
                yield this.download(this.URLs.pop())
            }
        },
        download: function(URL, dURL = null, setRecieved = (recieved)=>{return Object.assign(this.recieved[URL], recieved)}) {
            return new Promise((res, rej)=>{
                var timeout = setTimeout(()=>{
                    rej(Symbol.for('http2.client.timeout'))
                }, config.http2.client.timeout.beforeResponse )
                if(dURL==null) dURL = URL
                req = this.client.request({[HTTP2_HEADER_PATH]: dURL})
                req.setEncoding('UTF8')
                req.on('response', (headers, flags)=>{
                    console.log()
                    console.log('Download URL:')
                    console.log(dURL)
                    timeout.refresh()
                    this.recieved[URL] = {}
                    if(headers['server']==='Apache') 
                        {   req.on('data', ()=>{               
                        })
                            req.end()
                        }
                    var STATUS = headers[HTTP2_HEADER_STATUS] || headers[':status']          
                    if(STATUS>0) {
                        setRecieved( {
                            URL: URL,
                            status: STATUS,
                            content: '',
                            add: function(x){
                                this.content += x
                            },
                            type: null || headers['Content-Type'],
                            lang: null || headers['Content-Language'],
                            encoding: null || headers['Content-Encoding'],
                            lenght: null || headers['Content-Lenght']
                        })
                    } else {
                        setRecieved( {                            
                            URL: URL,
                            status: STATUS,
                        } )
                    }
                    if(STATUS===200) {                                    
                        req.on('data', (chunk)=>{
                            this.recieved[URL].add(chunk)
                            timeout.refresh()
                        })
                    }
                    if(STATUS===404) {
                        return rej({
                            sts:Symbol.for('http.client.request.rejected'),
                            for:Symbol.for('http.error.404'),
                            obj:this.recieved[URL]
                        })
                    }
                    if(STATUS===301) {
                        var currentLocation = headers['content-location'] || headers['location']

                        console.log('Redirect!')

                        if(!(this.recieved[URL].redirect instanceof Array)) this.recieved[URL].redirect = []
                        this.recieved[URL].redirect.push(currentLocation)
                        if(this.recieved[URL].lenght > config.http2.client.maxRedirect ) rej(Symbol.for('http2.client.maxRedirect'))
                        this.download(URL, currentLocation, (recieved)=>{
                            return Object.assign(this.recieved[URL], recieved)
                        }).then(
                            (resd)=>{
                                res(resd)
                            },
                            (rejd)=>{
                                rej(rejd)
                            }
                        )
                    }
                    console.log('End of req.on(request)')
                })
                req.on('end', ()=>{
                    console.log('ENDING')
                    res({
                        sts:Symbol.for('http.client.request.end'),
                        obj:this.recieved[URL]
                    })           
                })
            })
        },     
        then: function(callbackResolved, callbackRejected) {
            this.client = http2.connect(this.mainURL)
            this.loads  = []
            for(link of this) {
                this.loads.push(link.then(callbackResolved, callbackRejected))
            }
        }
    }
}

download('http2.akamai.com', [
    '/demo',
    '/non-existing-link'

]).then( (good)=>{
    console.log('Download: [finished]')
    console.log(good)
}, (bad)=>{
    console.log('Download: [failed]')
    console.log(bad)
})

Workaround: returning in req.on(response) can prevent from future lifetime, therefore if not onData set there, on data nothing happens, test are needed to be done if it's downloaded

This is my example on how to use web-server, you can throw request away by rejecting promise, before setting up on data hook.

I want to trust node developers, that simple solution to this, is that without "on-data"(but I am not sure of how does it works for real), the request wouldn't even download, than it's simplest way, but somehow hidden, the resolution of this.

Request lifetime is: ->response ->data ->end ->cancelStream

Therefore you can bind onData in the response, if you feel like deciding on the headers...

I don't believe that use-case when you data dependently decide, if to quit streaming http, based on http.

However, if you really decide to pause audio or video or file streaming from http2 in node, I suggest following things.

as I mentioned before, raw and awaited finish needs different callbacks

req.end() seems to fire end event, and if you pass data to it, ~ERR_WRITE_AFTER_REQUEST_END req.close() seems to be another candidate, but then, next on data tick triggers ~ERR_HTTP2_STREAM_CANCEL

mnini commented 4 years ago

Should the doc or the code be fixed ? Or both ?

Definitely code, one does need method to cancel set up request in middle of data streaming...

mnini commented 4 years ago

Is it status === 200, when it doesn't fire? For me it doesn't fire, when it has bad request status code mostly.

NeoPhi commented 4 years ago

I'm not seeing the end event regardless of the status code.

mnini commented 4 years ago

@NeoPhi I have end event firing there when having status 200.