Closed robin-cls closed 1 year ago
Hello,
While using the tutorial, I stumbled over a strange behavior.
If we replay the tutorial, but removing the cluster instanciation, the zcollection stays empty after the insert() step
Code to reproduce
from __future__ import annotations import datetime import pprint import dask.distributed import fsspec import numpy import zcollection import zcollection.tests.data def create_dataset(): """Create a dataset to record.""" generator = zcollection.tests.data.create_test_dataset_with_fillvalue() return next(generator) ds = create_dataset() ds.to_xarray() fs = fsspec.filesystem('memory') # Here, contrary to the tutorial, we do not instanciate a client # cluster = dask.distributed.LocalCluster(processes=False) # client = dask.distributed.Client(cluster) partition_handler = zcollection.partitioning.Date(('time', ), resolution='M') collection = zcollection.create_collection('time', ds, partition_handler, '/my_collection', filesystem=fs) collection.insert(ds) print(collection.load()) >> <Nothing>
Expected behavior
I was expected the insert() step to be effective even if we do not instanciate a cluster.
... collection.insert(ds) print(collection.load()) >> <zcollection.dataset.Dataset> >> Dimensions: ('num_lines: 61', 'num_pixels: 25') >>Data variables: >> time (num_lines datetime64[us]: dask.array<chunksize=(11,)> >> var1 (num_lines, num_pixels float64: dask.array<chunksize=(11, 25)> >> var2 (num_lines, num_pixels float64: dask.array<chunksize=(11, 25)> >> Attributes: >> attr : 1
zcollection version 2023.3.2
This is completely normal. If you use a memory file system, you have to use only one process, otherwise the Dask workers will write to the memory of the workers, but these are invisible for the main process.
Hello,
While using the tutorial, I stumbled over a strange behavior.
If we replay the tutorial, but removing the cluster instanciation, the zcollection stays empty after the insert() step
Code to reproduce
Expected behavior
I was expected the insert() step to be effective even if we do not instanciate a cluster.
zcollection version 2023.3.2