OPENDAP / olfs

The front end for the Hyrax Data Server
GNU Lesser General Public License v2.1
5 stars 8 forks source link

Firefox fails to download files every second try with hyrax 1.16.5 #145

Open samoylovfp opened 2 years ago

samoylovfp commented 2 years ago

Steps to reproduce:

Empty file is downloaded every second try

image

There are some errors in /var/log/tomcat/console.log of the container:

2022-03-09T14:32:43.724 +0000 [172.17.0.1] [Mozilla#5#0 #X11# Linux x86_64# rv#98#0# Gecko#20100101 Firefox#98#0] [thread:http-bio-8080-exec-8] [76842][8] [HTTP-GET] [/opendap/hyrax/file.nc.dap.nc4] [dap4.ce=#somevar#5B0:1:2#5D#5B0:1:2#5D] [] [] ERROR - opendap.bes.BES - Problem encountered with BES connection. Message: 'Failed to write to socket: Broken pipe (Write failed)' OPeNDAPClient executed 7 prior commands.
2022-03-09T14:32:47.656 +0000 [172.17.0.1] [Mozilla#5#0 #X11# Linux x86_64# rv#98#0# Gecko#20100101 Firefox#98#0] [thread:http-bio-8080-exec-3] [80774][12] [HTTP-GET] [/opendap/hyrax/file.nc.dap.nc4] [dap4.ce=#somevar#5B0:1:2#5D#5B0:1:2#5D] [] [] ERROR - opendap.bes.BES - Problem encountered with BES connection. Message: 'Failed to write to socket: Broken pipe (Write failed)' OPeNDAPClient executed 2 prior commands.

Works fine with chromium and when running hyrax:1.16.4

ndp-opendap commented 2 years ago

This is likely a problem with the BES as indicated by the console log message:

ERROR - opendap.bes.BES - Problem encountered with BES connection. Message: 'Failed to write to socket: Broken pipe (Write failed)' OPeNDAPClient executed 7 prior commands.

Typically this message indicates that the connected beslistener process has died. Can you please look at the associated BES log in /var/bes/bes.log to see if there are errors being logged there?

samoylovfp commented 2 years ago

For empty files it seems to only log "request", nothing about writing the files. Also weird that this does not happen with chromium, will see if there is any difference in the exchange

2022-03-09T14:31:41 GMT|&|103|&|request|&|get.dap|&|netcdf-4|&|/usr/share/hyrax/file.nc|&|/somevar[0:1:2][0:1:2]
2022-03-09T14:32:43 GMT|&|108|&|request|&|get.dap|&|netcdf-4|&|/usr/share/hyrax/file.nc|&|/somevar[0:1:2][0:1:2]
2022-03-09T14:32:43 GMT|&|108|&|info|&|BESUtil::file_to_stream() - Using ostream: 0x7ffe27fa7b20 cout: 0x60e6c0
2022-03-09T14:32:43 GMT|&|108|&|info|&|BESUtil::file_to_stream() - Sent 10843 bytes from file '/tmp/ncM1iWVV'.
2022-03-09T14:32:45 GMT|&|108|&|request|&|get.dap|&|netcdf-4|&|/usr/share/hyrax/file.nc|&|/somevar[0:1:2][0:1:2]
2022-03-09T14:32:47 GMT|&|109|&|request|&|get.dap|&|netcdf-4|&|/usr/share/hyrax/file.nc|&|/somevar[0:1:2][0:1:2]
2022-03-09T14:32:47 GMT|&|109|&|info|&|BESUtil::file_to_stream() - Using ostream: 0x7ffe27fa7b20 cout: 0x60e6c0
2022-03-09T14:32:47 GMT|&|109|&|info|&|BESUtil::file_to_stream() - Sent 10843 bytes from file '/tmp/ncIyv5uZ'.
samoylovfp commented 2 years ago

Strace from beslistener with comments

strace: Process 403227 attached
select(8, [7], NULL, NULL, {tv_sec=79, tv_usec=295847}

<Requesting a file>
) = 1 (in [7], left {tv_sec=64, tv_usec=760005})
accept(7, {sa_family=AF_INET, sin_port=htons(53442), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8
read(8, "PPTCLIENT_TESTING_CONNECTION", 64) = 28
write(8, "PPTSERVER_CONNECTION_OK", 23) = 23
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdb8681eb90) = 252
close(8)                                = 0
rt_sigprocmask(SIG_BLOCK, [HUP PIPE TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [HUP PIPE TERM CHLD], NULL, 8) = 0
select(8, [7], NULL, NULL, {tv_sec=120, tv_usec=0}
<Downloaded normal file>
<Requesting again>
) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=252, si_uid=999, si_status=SIGSEGV, si_utime=0, si_stime=1} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
rt_sigprocmask(SIG_BLOCK, [HUP PIPE TERM CHLD], NULL, 8) = 0
wait4(0, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], WNOHANG, NULL) = 252
wait4(0, 0x7ffe27fa81d0, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigprocmask(SIG_UNBLOCK, [HUP PIPE TERM CHLD], NULL, 8) = 0
select(8, [7], NULL, NULL, {tv_sec=120, tv_usec=0}

<Got empty file>
samoylovfp commented 2 years ago

https://gist.github.com/samoylovfp/2fe8467685e5612461de4fa37233a055 chromium and firefox requests and responses

ndp-opendap commented 2 years ago

Hi @samoylovfp,

Are you running this from a hyrax docker image you pulled from DockerHub (opendap/hyrax:1.16.5)? If so, would you be willing to try running this test against opendap/hyrax:snapshot ? We are pretty far down the road from 1.16.5 so it would be most beneficial to know if the problem persists in the code base before I start chasing this rabbit :)

If not, I'll take it on when I can get to it in the next few days.

Thanks!!

samoylovfp commented 2 years ago

Will do in an hour, thanks for taking your time to take a look!

samoylovfp commented 2 years ago

Still happens with [1.16.5-154]