ekalinin / nodeenv

Virtual environment for Node.js & integrator with virtualenv
http://ekalinin.github.io/nodeenv/

download_node_src does not properly handle multipart downloads #324

Open kcdodd opened 1 year ago

kcdodd commented 1 year ago

The method download_node_src fails if the download doesn't complete in a single part. In my case this led to a seemingly unrelated error, AttributeError: 'bytes' object has no attribute 'tell', deep in the tarfile module. The exception handler,

    try:
        dl_contents = io.BytesIO(urlopen(node_url).read())
    except IncompleteRead as e:
        logger.warning('Incomplete read while reading'
                       'from {}'.format(node_url))
        dl_contents = e.partial

assigned a bytes object to dl_contents instead of a BytesIO. However, updating the exception handler still did not work, because "partial" really does mean partial: it is not the complete file, so there is no way to use it. Simply calling read() again also does not appear to be the way to handle multipart downloads.
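
The type mismatch described above can be reproduced without any network access. The following is a minimal sketch, assuming the archive is later opened with `tarfile.open(fileobj=...)` (an assumption about the call site, not confirmed in this thread); the payload `b"partial data"` is made up for illustration.

```python
import tarfile
from http.client import IncompleteRead

# Simulate what the handler receives: IncompleteRead carries the bytes
# that were read before the connection ended (hypothetical payload).
exc = IncompleteRead(b"partial data", 100)
dl_contents = exc.partial  # plain bytes, not a file-like object

error_message = None
try:
    # tarfile expects a file-like object with tell()/seek()/read().
    tarfile.open(fileobj=dl_contents)
except AttributeError as err:
    error_message = str(err)

print(type(dl_contents).__name__)
print(error_message)
```

Wrapping `exc.partial` in `io.BytesIO` would avoid the AttributeError, but the data would still be a truncated archive.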

I got this to work by using requests, which appears to handle this properly:

    import io
    import requests
    dl_contents = io.BytesIO(requests.get(node_url).content)

bagerard commented 1 year ago

We are being affected by this issue as well. Are the maintainers OK with switching from urllib to requests? An alternative may be to make multiple attempts to download the file in case of IncompleteRead errors.

For future ref - potentially related issue here
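
The retry alternative mentioned in this comment could look roughly like the sketch below. It is not nodeenv's code: the function name `download_with_retries` and the parameters `attempts`, `delay`, and `opener` are hypothetical, introduced only for illustration. Each attempt restarts the download from scratch and discards any partial data.

```python
import io
import time
from http.client import IncompleteRead
from urllib.request import urlopen


def download_with_retries(url, attempts=3, delay=1.0, opener=urlopen):
    """Return the full response body as a BytesIO, retrying on IncompleteRead.

    `opener` is injectable so the function can be exercised without a network.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # A fresh request per attempt; partial data is never reused.
            return io.BytesIO(opener(url).read())
        except IncompleteRead as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)
    # All attempts failed: surface the last IncompleteRead to the caller.
    raise last_error
```

In this sketch the caller always gets either a complete BytesIO or an exception, so a truncated archive is never handed to tarfile.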

fruch commented 1 year ago

We are getting hit by the same issue, inside pre-commit.

Switching to requests sounds reasonable to me.

hynek commented 1 year ago

Is there anything that makes requests preferable over urllib3 (that requests depends on)?

fruch commented 1 year ago

It seems like https://github.com/ekalinin/nodeenv/pull/329 wasn't enough to fix this issue.

We are still getting hit by it from time to time: https://github.com/scylladb/scylla-cluster-tests/pull/6559#issuecomment-1701031041

@hynek if requests handles downloading a multipart file out of the box better than urllib3, that would be a good enough reason for me if it were my code.

bagerard commented 1 year ago

I could not reach a point where I was able to reproduce the issue consistently, so I can't confirm that it is related to multipart downloads; I would expect that to be easily reproducible. So it's unclear whether requests would actually fix it. I believe network glitches are causing this, as explained here

bagerard commented 1 year ago

FYI: https://github.com/nodejs/build/issues/1993 (the issue is closed, but it is still receiving comments)

jaklan commented 11 months ago

We are also affected by this when using node hooks in pre-commit.

fruch commented 1 month ago

Maybe there is a different place the node binary can be retrieved from? Mirrors or something like that?