TGSAI / mdio-cpp

C++, Cloud native, scalable storage engine for various types of energy data.
Apache License 2.0

Added prune and trim differentiation #115

BrianMichell closed 2 months ago

BrianMichell commented 2 months ago

Differentiate between a metadata-only trim and an actual data trim. The data trim can be slow even if no data actually gets deleted.

blasscoc commented 2 months ago

Can we have a macro to do the open and assign? Also, since we return a Future, can this be async?

Something like ASSIGN_OR_RETURN

  auto dsRes = mdio::Dataset::Open(dataset_path, mdio::constants::kOpen);
  if (!dsRes.status().ok()) {
    return dsRes.status();
  }
  mdio::Dataset ds = dsRes.value();

Is there a macro for this pattern? It's fine, but it makes for more reading.

  if (!spec.status().ok()) {
    // Something went wrong with Tensorstore retrieving the spec
    return spec.status();
  }
  auto specJsonResult = spec.value().ToJson(IncludeDefaults{});
  if (!specJsonResult.status().ok()) {
    return specJsonResult.status();
  }
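For reference, here is a minimal sketch of what an assign-or-return macro could look like. Everything in it is a hypothetical stand-in written for illustration: `Status`, `Result`, `EXAMPLE_ASSIGN_OR_RETURN`, `ParseChunkCount` and `TotalChunks` are not mdio or TensorStore names. As far as I know TensorStore already provides a macro in this spirit, `TENSORSTORE_ASSIGN_OR_RETURN` in `tensorstore/util/result.h`, which may be the better choice in practice.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-ins for a status/result pair (assumption, not the real types).
struct Status {
  std::string message;  // An empty message means "OK".
  bool ok() const { return message.empty(); }
};

template <typename T>
class Result {
 public:
  Result(T value) : value_(std::move(value)) {}
  Result(Status status) : status_(std::move(status)) {}
  const Status& status() const { return status_; }
  T& value() { return *value_; }  // Only valid when status().ok().

 private:
  Status status_;
  std::optional<T> value_;
};

// Two-step concatenation so that __LINE__ is expanded before pasting.
#define EXAMPLE_CONCAT_INNER(a, b) a##b
#define EXAMPLE_CONCAT(a, b) EXAMPLE_CONCAT_INNER(a, b)

// Evaluates the expression once, returns its status from the enclosing
// function on failure, and otherwise moves the value into `decl`.
// Note: this expands to several statements, so it must not be used as the
// body of an unbraced `if` or loop.
#define EXAMPLE_ASSIGN_OR_RETURN_IMPL(tmp, decl, ...) \
  auto tmp = (__VA_ARGS__);                           \
  if (!tmp.status().ok()) return tmp.status();        \
  decl = std::move(tmp.value())

#define EXAMPLE_ASSIGN_OR_RETURN(decl, ...)                                  \
  EXAMPLE_ASSIGN_OR_RETURN_IMPL(EXAMPLE_CONCAT(example_result_, __LINE__),   \
                                decl, __VA_ARGS__)

// Hypothetical fallible function: parses a non-negative decimal integer.
Result<int> ParseChunkCount(const std::string& text) {
  if (text.empty()) return Status{"empty input"};
  int n = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return Status{"not a number: " + text};
    n = n * 10 + (c - '0');
  }
  return n;
}

// The check-and-return boilerplate collapses to one line per call.
Result<int> TotalChunks(const std::string& a, const std::string& b) {
  EXAMPLE_ASSIGN_OR_RETURN(int first, ParseChunkCount(a));
  EXAMPLE_ASSIGN_OR_RETURN(int second, ParseChunkCount(b));
  return first + second;
}
```

With this, `TotalChunks("3", "4")` yields a result holding 7, while `TotalChunks("3", "x")` returns the failing status from the second parse without reaching the addition. The macro only works inside functions whose return type can be constructed from the status type.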

blasscoc commented 2 months ago

I'd suggest we have a utils folder, e.g. tensorstore/utils, if that makes sense, but keep the naming convention for the file names.

So trim_dataset.h and delete_dataset.h, with tests in trim_dataset_test.cc, etc.

BrianMichell commented 2 months ago

This can look like odd behavior: why leave data behind if it cannot be accessed? The data can technically be recovered by re-expanding the metadata. The metadata trim is also a much quicker operation than delete_sliced_out_chunks because it only operates on the metadata.

It is intended more for HPC-style applications that may not know the extent before anything gets written.
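
To make the distinction concrete, here is a toy sketch of the two operations on a one-dimensional chunked array. It is an illustration under stated assumptions, not the mdio implementation: `ToyArray`, `MakeToyArray`, `TrimMetadata` and `DeleteSlicedOutChunks` are hypothetical names, and an in-memory map stands in for the chunk store.

```cpp
#include <cstddef>
#include <map>

// Hypothetical 1-D chunked array (assumption: not the mdio/TensorStore model).
struct ToyArray {
  std::size_t extent;                 // Logical length recorded in the metadata.
  std::size_t chunk_size;             // Number of elements per chunk.
  std::map<std::size_t, int> chunks;  // Chunk index -> stored payload stand-in.
};

// Builds an array of 40 elements stored as 4 chunks of 10 elements each.
ToyArray MakeToyArray() {
  ToyArray a{40, 10, {}};
  for (std::size_t i = 0; i < 4; ++i) a.chunks[i] = static_cast<int>(i);
  return a;
}

// Metadata-only trim: changes the recorded extent and touches no chunk.
// Chunks beyond the new extent become unreachable but stay in storage.
void TrimMetadata(ToyArray& a, std::size_t new_extent) { a.extent = new_extent; }

// Data trim: visits every stored chunk and deletes those that lie entirely
// outside the current extent. The full visit happens even if nothing ends up
// being deleted. Returns the number of chunks removed.
std::size_t DeleteSlicedOutChunks(ToyArray& a) {
  const std::size_t needed = (a.extent + a.chunk_size - 1) / a.chunk_size;
  std::size_t removed = 0;
  for (auto it = a.chunks.begin(); it != a.chunks.end();) {
    if (it->first >= needed) {
      it = a.chunks.erase(it);
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

In this model, calling `TrimMetadata(a, 15)` and later `TrimMetadata(a, 40)` leaves all four chunks in place, so the data is reachable again. Calling `DeleteSlicedOutChunks(a)` while the extent is 15 removes the two chunks past the boundary for good.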