openPMD / openPMD-api

C++ & Python API for Scientific I/O
https://openpmd-api.readthedocs.io
GNU Lesser General Public License v3.0

adios2 streaming memory leak or incorrect usage? #1604

Closed: stefurnic closed this issue 3 months ago

stefurnic commented 3 months ago

I want to stream data with the ADIOS2 backend in a non-blocking way; a minimal working example is below. Memory usage keeps growing even though I set QueueLimit to 1 and QueueFullPolicy to Discard, so that iterations are dropped when the queue is full. Am I using it incorrectly?

#include <openPMD/openPMD.hpp>
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
#include <memory>
#include <numeric>
#include <unistd.h> // sleep(), POSIX only (e.g. Linux)

using std::cout;
using namespace openPMD;

int main(int argc, char* argv[]) {

    unsigned long i, j;
    std::vector<unsigned long> chunk; // buffer reused for every iteration

    // open file for streaming
    Series series = Series("samples/mwe.sst", Access::CREATE,
            R"(
            {
              "adios2": {
                "engine": {
                  "parameters": {
                    "DataTransport": "WAN",
                    "RendezvousReaderCount": "0",
                    "QueueLimit": "1",
                    "QueueFullPolicy": "Discard"
                  }
                }
              }
            })");

    unsigned long L=10000000; // data array length

    Datatype datatype = determineDatatype<unsigned long>();
    Dataset dataset = Dataset(datatype, {L});

    unsigned long N = 20; // No. of iterations (unsigned to match the loop counter)

    for(i=0; i<N; i++) {

        cout << "Iteration: " << i << "\n";

        // prepare local data
        for(j=0; j<L; j++)
            chunk.push_back(j+i*L);

        Iteration it = series.writeIterations()[i];

        MeshRecordComponent mesh =
            it.meshes["field"][MeshRecordComponent::SCALAR];
        mesh.resetDataset(dataset);
        mesh.storeChunk(chunk, {0}, {L});
        // closing the iteration flushes the stored chunk,
        // so the buffer can be reused afterwards
        it.close();
        chunk.clear();
        sleep(1);
    }

    series.close();

    sleep(10);

    return 0;
}
// ...

Software Environment:

franzpoeschel commented 3 months ago

Can you try adding variable-based encoding to your setup?

{
  "iteration_encoding": "variable_based",
  "adios2": {
    "engine": {
      "parameters": {
        "DataTransport": "WAN",
        "RendezvousReaderCount": "0",
        "QueueLimit": "1",
        "QueueFullPolicy": "Discard"
      }
    }
  }
}

Apart from this, you will still see a slight increase in memory usage over time, because openPMD currently does not erase past Iterations. Better support for long-running setups is an active topic that we are currently working on; for example, erasing past iterations will come with #1592.
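For readers who want to apply the suggestion to the example from the original post, the following is a minimal sketch of how the combined options might be wired in. The helper name `streamingOptions` is hypothetical and not part of openPMD-api; the JSON content is copied from the comment above, and the `Series` constructor call in the trailing comment is the one from the original post.

```cpp
#include <string>

// Hypothetical helper (not part of openPMD-api): returns the JSON options
// from the comment above, i.e. the original SST engine parameters plus
// "iteration_encoding": "variable_based".
std::string streamingOptions()
{
    return R"({
  "iteration_encoding": "variable_based",
  "adios2": {
    "engine": {
      "parameters": {
        "DataTransport": "WAN",
        "RendezvousReaderCount": "0",
        "QueueLimit": "1",
        "QueueFullPolicy": "Discard"
      }
    }
  }
})";
}

// Usage in the sender from the original post (requires openPMD-api):
//   Series series = Series("samples/mwe.sst", Access::CREATE,
//                          streamingOptions());
```

Keeping the options in one place makes it easier to switch the encoding on and off when comparing memory usage between runs.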

stefurnic commented 3 months ago

Adding "iteration_encoding": "variable_based" fixed the memory growth on the receiving side, where usage had been increasing with the number of iterations. On the sender side it reduced the rate at which memory usage grows, but did not bring it to zero. Thank you!
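To put a number on the remaining sender-side growth, one option is to log the process's resident set size once per iteration. The sketch below is an assumption-laden illustration, not something used in this thread: it is Linux-only, it reads the `VmRSS` line from `/proc/self/status`, and the function names `parseVmRssKiB` and `currentRssKiB` are hypothetical.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse the "VmRSS:" line from a /proc/<pid>/status-style stream.
// Returns the resident set size in KiB, or -1 if the line is missing
// or cannot be parsed.
long parseVmRssKiB(std::istream &status)
{
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmRSS:", 0) == 0) { // line starts with "VmRSS:"
            std::istringstream fields(line.substr(6));
            long kib = -1;
            fields >> kib;
            return fields ? kib : -1;
        }
    }
    return -1;
}

// Resident set size of the current process in KiB (Linux only).
// Returns -1 where /proc/self/status is not available.
long currentRssKiB()
{
    std::ifstream status("/proc/self/status");
    return status ? parseVmRssKiB(status) : -1;
}

// Possible use inside the write loop of the original example:
//   cout << "Iteration: " << i << ", RSS [KiB]: " << currentRssKiB() << "\n";
```

Comparing the logged values with and without variable-based encoding would show how much of the growth remains on the sender.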

stefurnic commented 3 months ago

Closing as fixed.