CCI-Tools / cate

ESA CCI Toolbox (Cate)
MIT License

Ingest SST and Soil Moisture Data from ODP #71

Closed (forman closed 7 years ago)

forman commented 8 years ago

Make sure SST and Soil Moisture Data from ODP is accessible and can be opened correctly.

JanisGailis commented 7 years ago

from cate.core.ds import DATA_STORE_REGISTRY
from cate.core.monitor import ConsoleMonitor
import cate.ops as ops

# Report progress on the console
monitor = ConsoleMonitor()
# Look up the ODP data store (not used further in this snippet)
data_store = DATA_STORE_REGISTRY.get_data_store('esa_cci_odp')
# Open daily soil moisture for 2000-2003, syncing the files locally first
sm = ops.open_dataset('esacci.SOILMOISTURE.day.L3S.SSMV.multi-sensor.multi-platform.COMBINED.02-2.r1',
                      '2000-01-01',
                      '2003-12-31',
                      sync=True,
                      monitor=monitor)

Syncing doesn't work for soil moisture. There is no monitor output (using either the API or the CLI), and no files are created in ~/.cate/data_stores/esa_cci_odp/xx. However, when monitoring network traffic, I see incoming traffic jump when I try to sync via either the API or the CLI, so it does seem to be downloading something.
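
One way to check whether a sync has written anything is to count the files under the local data store directory. The sketch below is only an illustration: `count_synced_files` is a hypothetical helper, not part of Cate, and the layout beneath `~/.cate/data_stores/esa_cci_odp` is assumed rather than known.

```python
from pathlib import Path


def count_synced_files(store_dir):
    """Count regular files below store_dir; return 0 if it does not exist.

    Hypothetical helper for illustration, not part of the Cate API.
    """
    root = Path(store_dir).expanduser()
    if not root.is_dir():
        return 0
    # rglob("*") walks all subdirectories; keep only regular files.
    return sum(1 for path in root.rglob("*") if path.is_file())


if __name__ == "__main__":
    # Local ODP store location as mentioned in the comment above.
    print(count_synced_files("~/.cate/data_stores/esa_cci_odp"))
```

Calling it before and after a sync attempt would show whether any files actually landed on disk.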

SST seems to work as expected.

Both datasets are daily, but we need monthly data, so temporal aggregation (producing monthly means) will have to be done.
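
To make the aggregation step concrete, here is a minimal plain-Python sketch of turning a daily series into monthly means. It is not how Cate implements it; `monthly_means` and the synthetic series are made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import fmean


def monthly_means(daily):
    """Average (date, value) pairs per calendar month.

    Returns a dict mapping (year, month) to the mean of that month's values.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[(day.year, day.month)].append(value)
    return {month: fmean(values) for month, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Synthetic daily series for January and February 2001:
    # each value is simply the day of the month.
    start = date(2001, 1, 1)
    days = [start + timedelta(days=i) for i in range(59)]
    series = [(d, float(d.day)) for d in days]
    print(monthly_means(series))
```

For a dataset `ds` with a `time` coordinate, recent xarray versions express the same idea as `ds.resample(time="1MS").mean()`.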

kbernat commented 7 years ago

@JanisGailis, you should be able to open this data source now. There was a problem with fetching multiple responses from the ESGF service; it's fixed now.

kbernat commented 7 years ago

cate ds sync esacci.SOILMOISTURE.day.L3S.SSMV.multi-sensor.multi-platform.COMBINED.02-2.r1 2001-01-01 2001-02-01
Sync esacci.SOILMOISTURE.day.L3S.SSMV.multi-sensor.multi-platform.COMBINED.02-2.r1: progress
32 of 32 file(s) synchronized.

JanisGailis commented 7 years ago

That was quick. I already fetched it manually, but I'll check whether syncing works as expected now!

JanisGailis commented 7 years ago

OK, I still get the same behavior. I'll investigate it further on Monday, maybe I missed something.

In the meantime, while opening SST data, I get OSError 'Too many open files'. People at xarray are discussing it: https://github.com/pydata/xarray/issues/463

It's three years of daily data, so 1000+ files.
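
This error generally means the process has hit its limit on open file descriptors. As a rough sketch (Unix only), the file count can be compared with the current limit. It assumes each file holds one descriptor while open; `fits_in_fd_limit` and its `reserved` allowance are hypothetical, chosen only for illustration.

```python
import resource


def fits_in_fd_limit(n_files, soft_limit, reserved=32):
    """Return True if n_files can be open at once under soft_limit.

    reserved is an assumed allowance for descriptors already in use
    (stdin/stdout, libraries, sockets); 32 is an arbitrary choice.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_files + reserved <= soft_limit


if __name__ == "__main__":
    # RLIMIT_NOFILE is the per-process cap on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    n_files = 3 * 365  # roughly three years of daily files
    print(f"soft limit: {soft}, hard limit: {hard}")
    if fits_in_fd_limit(n_files, soft):
        print("file count fits under the soft limit")
    else:
        print("'Too many open files' is likely")
```

On Linux the soft limit is often 1024, which is below the file count here. Raising it (for example with `ulimit -n` in the shell) or opening the files in smaller batches are the usual workarounds.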

JanisGailis commented 7 years ago

@kbernat I can confirm that synchronization of the soil moisture dataset now works. One just has to wait a bit before the process really starts and the output gets written to the monitor. I was apparently too impatient on Friday.

However, #102 is now a thing. This is not blocking for UC06 development as I will just use smaller datasets for that purpose.

JanisGailis commented 7 years ago

Datasets as they are on ODP can be opened. SST is highly compressed, so it expands to a lot of 'in-memory' data once opened; this will be solved by #118.