Closed emiliom closed 3 years ago
I was going to implement this yesterday but thought of a possible issue: when a Dataset is large, if all SetGroup*.set_* methods return the Dataset itself and we then save them all to file in, say, a save_all method, it seems we would be keeping the large Datasets around in memory until they are saved. Obviously, one way to circumvent this is to not have the save_all method and instead, for each group: create (using set_*) --> save to file --> destroy the Dataset object.
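A minimal sketch of the create --> save --> destroy idea, using only the standard library. FakeDataset, build_groups, save_group, and convert are hypothetical names for illustration, not echopype or xarray APIs; a real version would build an xarray Dataset per group and write it with the Dataset's own to_netcdf or to_zarr call.

```python
import json
import tempfile
import weakref
from pathlib import Path


class FakeDataset:
    """Hypothetical stand-in for an xarray Dataset holding one group's data."""

    def __init__(self, name, values):
        self.name = name
        self.values = values


def build_groups():
    """Yield one group at a time so only a single dataset is alive at once."""
    yield FakeDataset("Environment", [1500.0])
    yield FakeDataset("Platform", [0.0, 0.1])
    yield FakeDataset("Beam", list(range(1000)))  # stands in for the large group


def save_group(ds, out_dir):
    """Write one group to its own file and return the path."""
    path = Path(out_dir) / f"{ds.name}.json"
    path.write_text(json.dumps(ds.values))
    return path


def convert(out_dir):
    written, refs = [], []
    for ds in build_groups():
        written.append(save_group(ds, out_dir))
        # A weak reference lets us observe the object without keeping it alive.
        refs.append(weakref.ref(ds))
    del ds  # drop the reference to the last group
    return written, refs


with tempfile.TemporaryDirectory() as tmp:
    written, refs = convert(tmp)
    names = [p.name for p in written]
    # Relies on CPython reference counting freeing objects immediately.
    all_released = all(r() is None for r in refs)
```

Because the generator hands over each group and keeps no reference to it, the previous group becomes collectable as soon as the loop moves on, so peak memory is roughly one group rather than all of them.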
The majority of the data sits in the Beam group, which holds the backscatter data. The file size is something the users can choose. The largest file I've seen so far is ~300 MB, from EK80.
Thoughts?
I've already forgotten the details of how this stuff works :disappointed: .... (though I did re-read the associated discussion, #225).
But, my main comment would be to not focus on this issue at this time. There's already plenty to do to get 0.5.0 out the door. This issue can wait.
In the new class-redesign branch, the SetGroup* classes and functionality perform two related but distinct functions: create the xarray dataset for each nc4 group, and save it to file (nc or zarr). Separating those functions would provide more flexibility to power users, without impacting regular users who use Convert followed by the new to_netcdf and to_zarr methods to write to files.
See https://github.com/OSOceanAcoustics/echopype/discussions/225 for more discussions on this.