beatrixparis / connectivity-modeling-system

The CMS is a multiscale stochastic Lagrangian framework developed by Paris' Lab at the Rosenstiel School of Marine, Atmospheric & Earth Science to study complex behaviors, giving probabilistic estimates of dispersion, connectivity, fate of pollutants, and other Lagrangian phenomena. This repository facilitates community contributions to CMS modules.
https://beatrixparis.github.io/connectivity-modeling-system/
GNU General Public License v3.0

Getdata querying .dds/.das repeatedly #17

Open salwood82 opened 7 years ago

salwood82 commented 7 years ago

Some feedback from Micheal McDonald at HYCOM re getdata:

I see that the code might be unnecessarily querying the same .dds/.das [Dataset Descriptor Structure (DDS) and Dataset Attribute Structure (DAS)] repeatedly. Note: this does not change, ever, as it is a reanalysis product. So from a network perspective, caching this .dds/.das in an array locally, once, upon dataset open would cut down on a lot of unnecessary client->server->client bandwidth for each query.
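The caching idea in the feedback above could look roughly like the following. This is only a minimal sketch of "fetch once on open, reuse afterwards", not how getdata or its underlying library actually works: `MetadataCache`, `fake_fetch` and the example URL are hypothetical names introduced for illustration, and the network call is replaced by a stand-in so the sketch runs offline.

```python
from typing import Callable, Dict, Tuple


class MetadataCache:
    """Fetch a dataset's .dds/.das once and reuse them for later queries."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # hypothetical function that performs the HTTP GET
        self._cache: Dict[str, Tuple[str, str]] = {}
        self.requests_made = 0  # number of metadata requests actually sent

    def get(self, dataset_url: str) -> Tuple[str, str]:
        """Return (dds, das); the server is contacted on first use only."""
        if dataset_url not in self._cache:
            dds = self._fetch(dataset_url + ".dds")
            das = self._fetch(dataset_url + ".das")
            self.requests_made += 2
            self._cache[dataset_url] = (dds, das)
        return self._cache[dataset_url]


def fake_fetch(url: str) -> str:
    # Stand-in for a real network call (assumption, for offline use).
    return "contents of " + url


URL = "http://example.org/dodsC/some_dataset"  # hypothetical dataset URL
cache = MetadataCache(fake_fetch)
for _ in range(10):  # e.g. ten variable downloads from the same dataset
    dds, das = cache.get(URL)
print(cache.requests_made)  # 2
```

Keying the cache on the dataset URL keeps the approach general across data sources: a product whose metadata can change would simply need a way to clear or bypass the cache.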

Not sure if this is something that can be changed whilst still making the code general for a range of data sources?

salwood82 commented 6 years ago

MORE INFO: getdata first opens a network connection and makes 5 queries: 1 to get the data on depth, lat, time and tau; 1 to get the data on lon (why is this done separately? For offset grids?); 2 for dds (why 2?); and 1 for das.

For each nest file downloaded, it then opens a network connection for each variable to download (e.g. 1 for uvel and 1 for vvel), one after the other, and again makes 5 queries for each: 1 for the variable; 1 for depth, lat, time and tau (this seems unnecessary); 1 for das; and 2 for dds (which, as Micheal suggests, may be unnecessary).
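To put a rough number on this, the arithmetic can be sketched as below. The function name and its parameters are hypothetical; the 978 files are the count reported by this run ("Reading data file 1 of 978"), and the "needed" figure assumes that only the variable query itself is required per download, which is a guess rather than something confirmed for the underlying library.

```python
def request_counts(n_files: int, n_vars: int,
                   per_download: int = 5, setup: int = 5):
    """Return (total, needed) HTTP queries for one getdata run.

    total:  setup queries plus `per_download` queries per variable per file.
    needed: setup queries plus a single data query per variable per file
            (assumption: the repeated dds/das/axis queries are redundant).
    """
    downloads = n_files * n_vars
    total = setup + downloads * per_download
    needed = setup + downloads
    return total, needed


total, needed = request_counts(n_files=978, n_vars=2)
print(total, needed, total - needed)  # 9785 1961 7824
```

Under these assumptions about four out of every five queries in the run would be repeats of information already fetched.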

So while it only ever has 1 network connection open at once, it is making a lot of (unnecessary?) queries, which means a lot of bandwidth is required. This slows the download down, and I think it might also be causing it to crash often:

e.g.:

strace -s 120 -e sendto ./getdata test 1
 fill_value and velocity_conversion_factor on read  -30.00000       1.000000

 Opening file for reading:
 http://tds.hycom.org/thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**das** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**depth,lat,time,tau** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hyco"..., 142, MSG_NOSIGNAL, NULL, 0) = 142
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**lon** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: "..., 127, MSG_NOSIGNAL, NULL, 0) = 127

 Succesfully read data: Longitude
 Succesfully read data: Latitude
 Succesfully read data: Depth
 Succesfully read data: Time
 Starting X Axis index:         2201
 Ending X Axis index  :         2258
 Starting Y Axis index:         1676
 Ending Y Axis index  :         1751
 Starting Z Axis index:            6
 Ending Z Axis index  :            6
 Starting T Axis index:          481
 Ending T Axis index  :         1458
 Error: CMS can only handle regular time steps between data.
 The field with the name time has a irregular time step between the data:
   109560.0      and    109584.0

 Reading data file            1  of          978
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**das** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**depth,lat,time,tau** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hyco"..., 142, MSG_NOSIGNAL, NULL, 0) = 142
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**water%5fu.water%5fu**[480][5][1675:1750][2200:2257] HTTP/1.1\r\nUser-A"..., 173, MSG_NOSIGNAL, NULL, 0) = 173
 Succesfully read data: U-velocity

sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**das** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hycom.org\r\nAccept: */*\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**depth,lat,time,tau** HTTP/1.1\r\nUser-Agent: oc4.4.1.1\r\nHost: tds.hyco"..., 142, MSG_NOSIGNAL, NULL, 0) = 142
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**water%5fv.water%5fv**[480][5][1675:1750][2200:2257] HTTP/1.1\r\nUser-A"..., 173, MSG_NOSIGNAL, NULL, 0) = 173
 Succesfully read data: V-velocity
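A trace like the one above can be tallied by request type with a few lines of Python. This is a minimal sketch: `classify` and `tally` are hypothetical helper names, the sample lines are abbreviated from the trace, and the `*` characters are stripped only because the `**` emphasis markers in this comment end up inside the request paths.

```python
import re
from collections import Counter

# Captures the request path in a line such as:
#   sendto(3, "GET /path/3hrly.dds HTTP/1.1\r\n"..., 122, ...) = 122
GET_RE = re.compile(r'sendto\(\d+, "GET (\S+) HTTP')


def classify(path: str) -> str:
    """Label a request as dds, das, data:<query>, or other."""
    path = path.replace("*", "")  # drop ** emphasis markers, if present
    if path.endswith(".dds"):
        return "dds"
    if path.endswith(".das"):
        return "das"
    if ".dods?" in path:
        query = path.split(".dods?", 1)[1]
        return "data:" + query.split("[", 1)[0]  # strip index ranges
    return "other"


def tally(lines) -> Counter:
    """Count requests by type; lines that are not GET requests are ignored."""
    counts = Counter()
    for line in lines:
        match = GET_RE.search(line)
        if match:
            counts[classify(match.group(1))] += 1
    return counts


# The five requests made for the V-velocity download, abbreviated.
sample = r'''
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**das** HTTP/1.1\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.**dds** HTTP/1.1\r\n"..., 122, MSG_NOSIGNAL, NULL, 0) = 122
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**depth,lat,time,tau** HTTP/1.1\r\n"..., 142, MSG_NOSIGNAL, NULL, 0) = 142
sendto(3, "GET /thredds/dodsC/GLBu0.08/expt_19.1/2012/3hrly.dods?**water%5fv.water%5fv**[480][5][1675:1750][2200:2257] HTTP/1.1\r\n"..., 173, MSG_NOSIGNAL, NULL, 0) = 173
 Succesfully read data: V-velocity
'''.splitlines()

counts = tally(sample)
for kind, n in sorted(counts.items()):
    print(kind, n)
```

For this sample the tally shows two dds requests, one das request and one axis query alongside the single `water%5fv` data request, which matches the 5-queries-per-variable pattern described earlier in this comment.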