Timh37 opened this issue 1 year ago
@jbusecke I am revisiting the old workflow here. It would be great if we could briefly discuss if/how this could be sped up during our meeting tomorrow, in light of the many new datasets!
Do you think this is solved by my work in https://github.com/jbusecke/CMIP6cex/tree/jbusecke_performance_regridding? Or should I make a PR for that? FYI, I also mentioned this in a Pangeo Discourse topic recently. Hopefully for now a hacky yet sufficiently performant solution will do, but maybe in the future there is a more satisfying way to handle this in general.
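For reference, the core of that branch is (as far as I remember) just reusing the regridding weights across datasets that share a source grid. A minimal sketch with xESMF and toy grids, not the actual branch code:

```python
import numpy as np
import xarray as xr
import xesmf as xe

# Toy stand-ins for a CMIP6 source grid and a common 1x1 degree target grid.
ds_source = xr.Dataset(
    {"zos": (("lat", "lon"), np.random.rand(90, 180))},
    coords={"lat": np.linspace(-89, 89, 90), "lon": np.linspace(0, 358, 180)},
)
ds_target = xe.util.grid_global(1.0, 1.0)

# Building the regridder computes the weights; this is the expensive step...
regridder = xe.Regridder(ds_source, ds_target, method="bilinear", periodic=True)

# ...so build it once per source grid and reuse it for every dataset
# (every member, every variable) that lives on that grid.
zos_regridded = regridder(ds_source["zos"])
```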
@jbusecke yes and no, I incorporated your work in https://github.com/Timh37/CMIP6cex/blob/main/cmip6_processing/testing/store_CMIP6_regridded_datasets.ipynb, which works like a charm. However, I am struggling to adapt the workflow to regrid to tide gauges: https://github.com/Timh37/CMIP6cex/blob/main/cmip6_processing/testing/store_CMIP6_datasets_at_tgs.ipynb. I tried both fancy indexing and regridding to each tide gauge in a loop, but both are slow (and at some point the client fails to reconnect to the scheduler?). If you have a chance to take a look, that would be appreciated!
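For concreteness, this is roughly the fancy-indexing variant I tried, with toy data and made-up gauge locations:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy field and hypothetical tide-gauge coordinates.
ds = xr.Dataset(
    {"zos": (("time", "lat", "lon"), np.random.rand(12, 90, 180))},
    coords={
        "time": pd.date_range("2000-01-01", periods=12, freq="MS"),
        "lat": np.linspace(-89, 89, 90),
        "lon": np.linspace(0, 358, 180),
    },
)
tg_lat = xr.DataArray([52.4, 40.7, -33.9], dims="tg")
tg_lon = xr.DataArray([4.5, 286.0, 18.4], dims="tg")

# Because tg_lat and tg_lon share the "tg" dimension, .sel does pointwise
# (vectorized) selection: one call returns a (time, tg) array instead of
# one small object per gauge.
ds_at_tgs = ds.sel(lat=tg_lat, lon=tg_lon, method="nearest")
```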
Looks like this is solved by reducing the batch size! Will confirm when I know more.
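For the record, the batching I mean is roughly this (toy datasets, hypothetical names):

```python
import numpy as np
import xarray as xr

def batched(items, batch_size):
    """Yield successive batches of at most batch_size items."""
    items = list(items)
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

# Toy stand-in for the dict of CMIP6 datasets; in the real workflow these
# are dask-backed, so the graph shipped per store call scales with batch size.
datasets = {
    f"model_{i}": xr.Dataset({"zos": (("lat", "lon"), np.random.rand(90, 180))})
    for i in range(10)
}

for batch in batched(datasets.items(), batch_size=4):
    for name, ds in batch:
        # One small store call per dataset instead of one huge graph
        # for everything at once.
        ds.to_zarr(f"processed/{name}.zarr", mode="w")
```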
Sounds good. We can put in some more effort here after mid Nov if needed.
I keep getting errors along these lines:
/srv/conda/envs/notebook/lib/python3.10/site-packages/distributed/client.py:3141: UserWarning: Sending large graph of size 36.53 MiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
  warnings.warn(
/srv/conda/envs/notebook/lib/python3.10/site-packages/distributed/client.py:3141: UserWarning: Sending large graph of size 33.03 MiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
  warnings.warn(
Exception in callback None()
handle: <Handle cancelled>
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 1367, in _do_ssl_handshake
    self.socket.do_handshake()
  File "/srv/conda/envs/notebook/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1007)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/platform/asyncio.py", line 192, in _handle_events
    handler_func(fileobj, events)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 691, in _handle_events
    self._handle_read()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 1454, in _handle_read
    self._do_ssl_handshake()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 1385, in _do_ssl_handshake
    return self.close(exc_info=err)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 606, in close
    self._signal_closed()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/tornado/iostream.py", line 636, in _signal_closed
    self._ssl_connect_future.exception()
asyncio.exceptions.CancelledError
2023-10-24 22:17:33,777 - distributed.client - ERROR - Failed to reconnect to scheduler after 30.00 seconds, closing client
/srv/conda/envs/notebook/lib/python3.10/site-packages/distributed/client.py:3141: UserWarning: Sending large graph of size 42.76 MiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
  warnings.warn(
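For what it's worth, the warning's own suggestion (scatter the data ahead of time and use futures) would look roughly like this; the large array here is a made-up stand-in for whatever big static object ends up embedded in the graph:

```python
import numpy as np
from distributed import Client

client = Client()  # assumes a running cluster; Client() spins up a local one

# Hypothetical large static array (e.g. target coordinates or weights) that
# would otherwise get embedded into every submitted task graph.
big_array = np.random.rand(2000, 2000)

# Ship it to the workers once...
big_future = client.scatter(big_array, broadcast=True)

# ...and hand tasks the future instead of the raw array, so each submitted
# graph stays small.
def column_means(arr):
    return arr.mean(axis=0)

result = client.submit(column_means, big_future).result()
```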
I think I will hold off on updating the workflow with the speed improvements until we can figure this out after mid-November.
As discussed, @jbusecke, while it doesn't have high priority at the moment, it would be good to see whether the main infrastructure for processing the CMIP6 data and deriving changes in extremes from the simulations can be made more efficient. This may be a good place to start.