llohse / libnpy

C++ library for reading and writing of numpy's .npy files
MIT License
373 stars 71 forks source link

Allow for memory reuse by reading into already existing buffers #41

Open kstppd opened 1 month ago

llohse commented 1 month ago

Thank you for your contribution. I cannot quite figure out what the use case of this interface is, though. Could you elaborate a bit? Are you thinking of something specific?

kstppd commented 1 month ago

Hi there, I am using this lib in a raymarcher of mine and I need to read a timeliness of files which are in the hundreds. Since all the datasets are smae size I thought it would be good to have an interface allows the usage of already allocated memory. The same buffer is used in a loop to read in files instead of constantly mallocing a new one.

llohse commented 1 month ago

Yes, such an interface definitely makes sense. I am a bit hesitant with the one you suggest, though. It makes perfect sense for your specific use case but is still quite restrictive because it requires the data to be owned by an npy_data object. If the data is supposed to go into a preallocated buffer, you still need to copy it.

How about an interface where you just specify a view/pointer to the buffer?

I can think of 3 interfaces:

  1. inline npy_data_ptr<Scalar> read_npy(std::istream &in, *Scalar dst_buf, size_t count) I don't particularly like this because of the raw pointer.

  2. inline npy_data_ptr<Scalar> read_npy(std::istream &in, std::span<Scalar> dst_buf) This is very clean but requires c++20, which I don't want to require.

  3. inline npy_data_ptr<Scalar> read_npy(std::istream &in, It dst_buf, size_t count) This seems like a good compromise? I have made a pull request in #43

It could be used like this:

#include "npy.hpp"
#include <vector>
#include <string>

int main() {
  std::vector<double> data;
  data.resize(1024);

  const std::string path{"in.npy"};
  auto  d = npy::read_npy<double,data::iterator>(path, d.begin(), d.size());
}

What do you think?