galaxyproject / pulsar

Distributed job execution application built for Galaxy
https://pulsar.readthedocs.io
Apache License 2.0

REST connections break with Py3 using standard transport #227

AndreasSko closed this issue 4 years ago

AndreasSko commented 4 years ago

After upgrading Galaxy (20.01) to Python 3, I noticed that Pulsar would close the connection while transferring data for no apparent reason, resulting in a "broken pipe" error on the Galaxy side. On other occasions, after a seemingly successful transfer, Pulsar would just store an empty dataset.

After debugging a bit with tcpdump, I noticed that Galaxy, when running under Python 3, sends chunked requests (Transfer-Encoding: chunked), which Pulsar apparently does not handle. However, I did not find a way to deactivate this, as the Content-Length header is already set (which should normally indicate to urllib that the request should NOT be chunked). Maybe it has something to do with the way the file is handled using mmap? I fixed the problem for myself by switching the transport mode to curl, but I thought it might still be helpful to open this issue.
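For readers who want to see the chunking behaviour in isolation, here is a minimal sketch using only the Python 3 standard library. It is not Pulsar's or Galaxy's code and it does not reproduce the bug itself, since the root cause is not stated in this thread. It only shows how `http.client` (the layer underneath urllib) treats a file-like body with and without an explicit Content-Length header. `RecordingSocket`, `request_bytes`, the host `pulsar.invalid`, and the path `/upload` are hypothetical names made up for the illustration; no network traffic is generated.

```python
import http.client
import io


class RecordingSocket:
    """Hypothetical stand-in for a real socket: it only records outgoing bytes."""

    def __init__(self):
        self.sent = bytearray()

    def sendall(self, data):
        self.sent += data

    def close(self):
        pass


def request_bytes(body, headers=None):
    """Return the raw bytes http.client would write for a POST with this body."""
    conn = http.client.HTTPConnection("pulsar.invalid")  # made-up host, never contacted
    conn.sock = RecordingSocket()  # pre-set socket, so no real connection is opened
    conn.request("POST", "/upload", body=body, headers=headers or {})
    return bytes(conn.sock.sent)


payload = b"dataset contents"

# File-like body and no Content-Length: the length cannot be determined up front,
# so http.client adds "Transfer-Encoding: chunked" and frames the body in chunks.
chunked = request_bytes(io.BytesIO(payload))

# Same body with an explicit Content-Length: the body is written as-is.
sized = request_bytes(
    io.BytesIO(payload), {"Content-Length": str(len(payload))}
)

print(chunked.decode("latin-1"))
print(sized.decode("latin-1"))
```

Comparing the two printed requests against a packet capture is one way to check which framing a client actually used on the wire.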

Attached are the tcpdump captures from Galaxy running under Python 2 and Python 3 (they can be opened with Wireshark): py2.txt py3.txt

hexylena commented 4 years ago

@mvdbeek/@nsoranzo/@natefoo and I noticed the same issue during admin training. Didn't we open an issue somewhere?

natefoo commented 4 years ago

I wish I'd found this before spending a couple days debugging the same issue! Excellent detective work, and thanks @AndreasSko. This is fixed in #231, and will be released in Pulsar 0.14.0.