pelkmanslab / TissueMAPS-OLD

Old TissueMAPS code (until 2016); please use pelkmanslab/TissueMAPS instead.
http://github.com/pelkmanslab/TissueMAPS

Tm_Client fails when downloading bigger dataset #55

Open jluethi opened 7 years ago

jluethi commented 7 years ago

Whenever you download something bigger with TmClient, the downloaded data becomes meaningless. While the data seems to be correct in the database, the output of tmclient download is nonsensical. A few examples:

As long as you just download a small result set, everything is fine and the values are reasonable. As soon as you download something bigger, you run into these problems.

I’m currently trying to narrow down "big" vs. "small" datasets; here’s what I’ve got so far:

I think that was also the problem we had with Andi’s data two weeks ago. I suspect that bug was never actually fixed; we just haven’t tested such a big dataset since then.

My suspicion is that when something too big needs to be streamed via the command line, things become a mess. Maybe it’s related to the piping limitations described here: https://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error

hackermd commented 7 years ago

Can you please try to insert print statements before this line to check whether the correct values are sent by the server?

Maybe the problem is related to compression, which only gets applied above a certain size limit (e.g. by nginx), and to client-side decompression/decoding. Add a print(res) before this line to see what the full response looks like client side.
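To make the compression hypothesis concrete: if the server only gzips responses above a size threshold, small downloads arrive as plain text while big ones arrive encoded, and a client that ignores Content-Encoding sees garbage exactly for big datasets. A minimal stdlib sketch of the behavior (the threshold and header handling are illustrative, not TmClient’s or nginx’s actual code):

```python
import gzip

GZIP_MIN_LENGTH = 1024  # illustrative; mirrors nginx's gzip_min_length idea


def serve(body: bytes):
    """Mimic a server that compresses only responses above a size limit."""
    if len(body) >= GZIP_MIN_LENGTH:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


def decode(headers: dict, payload: bytes) -> bytes:
    """Client side: honour Content-Encoding, or large payloads look like garbage."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


small = b"id,value\n1,0.5\n"
big = b"id,value\n" + b"\n".join(b"%d,0.5" % i for i in range(200))
```

Only the large payload takes the gzip branch, which would explain why small downloads look fine while big ones don’t.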

You can also use the dev servers to test whether this is related to nginx:

# application server
tm_server -vv

# web server
cd ~/tmui/src
gulp
hackermd commented 7 years ago

The downloaded data now gets streamed directly to files on disk client side: TissueMAPS/TmClient@45412f00233efb8c87dd8ea845e90953a62f1609

In addition, data gets chunked for more efficient streaming: TissueMAPS/TmServer@c55c1686fc3b6c2adb6752ccc1d8b70af65a1e1d

With this I was able to efficiently stream a dataset (via NGINX/uWSGI). However, at some point the download got stuck, i.e. no additional rows were appended to the CSV file. I found the following error in the uwsgi log (the nginx error log is clean and no HTTP error was received client side):

Wed Mar 29 12:46:29 2017 - uwsgi_response_write_body_do(): Broken pipe [core/writer.c line 419] during GET /api/experiments/dG1hcHMxMA==/mapobject_types/dG1hcHMyMQ==/feature-values (46.127.159.87)
IOError: write error
[pid: 15006|app: 0|req: 1/8] 46.127.159.87 () {38 vars in 806 bytes} [Wed Mar 29 10:19:21 2017] GET /api/experiments/dG1hcHMxMA==/mapobject_types/dG1hcHMyMQ==/feature-values => generated 71488271 bytes in 8827958 msecs (HTTP/1.1 200) 2 headers in 140 bytes (1368 switches on core 99)
[pid: 15006|app: -1|req: -1/9]  () {0 vars in 0 bytes} [Wed Mar 29 12:47:15 2017]   => generated 0 bytes in 0 msecs ( 0) 0 headers in 0 bytes (1 switches on core 99)
[pid: 15001|app: 0|req: 2/10] 47.203.95.65 () {36 vars in 674 bytes} [Wed Mar 29 12:47:28 2017] GET http://httpheader.net/ => generated 233 bytes in 27 msecs (HTTP/1.1 404) 2 headers in 72 bytes (3 switches on core 99)
uwsgi_proto_http_parser() -> client closed connection
[pid: 15013|app: -1|req: -1/11]  () {0 vars in 0 bytes} [Wed Mar 29 12:47:29 2017]   => generated 0 bytes in 10011 msecs ( 0) 0 headers in 0 bytes (2 switches on core 99)
uwsgi_proto_http_parser() -> client closed connection
[pid: 15014|app: -1|req: -1/12]  () {0 vars in 0 bytes} [Wed Mar 29 12:47:39 2017]   => generated 0 bytes in 10010 msecs ( 0) 0 headers in 0 bytes (2 switches on core 99)
uwsgi_proto_http_parser() -> client closed connection
[pid: 15007|app: -1|req: -1/13]  () {0 vars in 0 bytes} [Wed Mar 29 12:47:50 2017]   => generated 0 bytes in 10010 msecs ( 0) 0 headers in 0 bytes (2 switches on core 99)
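The two commits above amount to a simple pattern: the server yields the CSV in fixed-size chunks instead of building one huge response body, and the client appends each chunk to a file instead of buffering everything in memory. A rough stdlib-only sketch of that pattern (chunk size and function names are illustrative, not the actual TmServer/TmClient code):

```python
import io
import tempfile

CHUNK_SIZE = 64 * 1024  # illustrative chunk size


def stream_feature_values(csv_bytes: bytes):
    """Server side: yield the CSV in chunks instead of one big response."""
    buf = io.BytesIO(csv_bytes)
    while True:
        chunk = buf.read(CHUNK_SIZE)
        if not chunk:
            return
        yield chunk


def download_to_file(chunks, path):
    """Client side: append each chunk to disk instead of holding it in memory."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


data = b"mapobject_id,feature,value\n" * 10000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
download_to_file(stream_feature_values(data), path)
```

Note this keeps memory bounded on both sides but makes the single HTTP request even longer-lived, which is exactly where the proxy timeouts below start to bite.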
hackermd commented 7 years ago

Potential culprits on the side of NGINX:

Setting all of these parameters to 600 solved the problem.
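The specific parameter list did not survive in this thread. As a sketch only, the nginx timeout directives that typically govern a long-running download through a standard `uwsgi_pass` setup look like this (the deployed config may use different directives or values):

```nginx
# sketch, not the deployed config; values in seconds as described above
uwsgi_read_timeout 600;   # how long nginx waits for uWSGI to send data
uwsgi_send_timeout 600;   # how long nginx waits when sending to uWSGI
send_timeout       600;   # how long nginx waits for the client to accept data
```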

hackermd commented 7 years ago

With the above settings, I got Timeout !!! instead of Broken pipe.

I then made the settings uwsgi-specific

but I got the same error again.

hackermd commented 7 years ago

There are additional timeout options on the side of uWSGI:

$ uwsgi --help | grep timeout
    -t|--harakiri                          set harakiri timeout
    --mule-harakiri                        set harakiri timeout for mule tasks
    -z|--socket-timeout                    set internal sockets timeout
    --spooler-harakiri                     set harakiri timeout for spooler tasks
    --wait-for-interface-timeout           set the timeout for wait-for-interface
    --wait-interface-timeout               set the timeout for wait-for-interface
    --wait-for-iface-timeout               set the timeout for wait-for-interface
    --wait-iface-timeout                   set the timeout for wait-for-interface
    --wait-for-fs-timeout                  set the timeout for wait-for-fs/file/dir
    --wait-for-socket-timeout              set the timeout for wait-for-socket
    --so-send-timeout                      set SO_SNDTIMEO
    --socket-send-timeout                  set SO_SNDTIMEO
    --so-write-timeout                     set SO_SNDTIMEO
    --socket-write-timeout                 set SO_SNDTIMEO
    --ssl-sessions-timeout                 set SSL sessions timeout (default: 300 seconds)
    --ssl-session-timeout                  set SSL sessions timeout (default: 300 seconds)
    --chunked-input-timeout                set default timeout for chunked input
    --ping-timeout                         set ping timeout
    --carbon-timeout                       set carbon connection timeout in seconds (default 3)
    --fastrouter-timeout                   set fastrouter timeout
    --http-timeout                         set internal http socket timeout
    --http-headers-timeout                 set internal http socket timeout for headers
    --http-connect-timeout                 set internal http socket timeout for backend connections
    --rawrouter-timeout                    set rawrouter timeout
    --sslrouter-timeout                    set sslrouter timeout

Increasing socket-timeout fixes the error for me.
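For reference, a sketch of how that option could be set persistently (the file location and surrounding config depend entirely on the deployment; 600 mirrors the nginx value used above):

```ini
; uwsgi.ini sketch, equivalent to passing -z/--socket-timeout on the command line
[uwsgi]
socket-timeout = 600  ; internal socket timeout in seconds
```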

jluethi commented 7 years ago

These timeouts sound interesting. Where do I change uwsgi and nginx options on chimp? Are there settings files I can edit? Would those edits apply to all uwsgi workers after a restart of uwsgi?

hackermd commented 7 years ago

We should probably change this approach entirely: instead of trying to stream the entire dataset, it might be better to request the data for each site individually, i.e. send a separate HTTP request per site. I will try this in the next few days.
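The per-site approach would replace one long-lived request (which has to outlive every proxy timeout) with many short ones. A sketch of the client-side loop; the `fetch_site_feature_values` helper and its endpoint are hypothetical stand-ins, not the TmClient API:

```python
import csv
import tempfile


def fetch_site_feature_values(site_id):
    """Hypothetical stand-in for one short HTTP request per site,
    e.g. GET /api/experiments/<id>/sites/<site_id>/feature-values."""
    return [{"site": site_id, "mapobject_id": site_id * 10, "value": 0.5}]


def download_all(site_ids, path):
    """Issue one small request per site and append the rows to a single CSV."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["site", "mapobject_id", "value"])
        writer.writeheader()
        for site_id in site_ids:  # each iteration is one quick request
            for row in fetch_site_feature_values(site_id):
                writer.writerow(row)


path = tempfile.NamedTemporaryFile(delete=False, suffix=".csv").name
download_all([1, 2, 3], path)
```

Each request finishes quickly, so no single transfer ever approaches the nginx or uWSGI timeouts, at the cost of more request overhead.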

hackermd commented 7 years ago

I just realized that you can of course also connect directly to the application server via tmclient:

tm_client -H app.tissuemaps.org -P 8080 -u markus experiment ls

This should circumvent all problems related to communication between NGINX and uWSGI via the socket.