Two writers in one I/O - Githubissues

germasch commented 5 years ago

The last commit of PR #1312 adds the following test:

    auto writer = this->m_Io.Open(filename, adios2::Mode::Write);
    auto writer2 = this->m_Io.Open("multi_put2writers2.bp", adios2::Mode::Write);
    auto var = this->m_Io.template DefineVariable<T>("var", this->m_Shape);

    MyData<T> myData(this->m_Selections);

    for (int b = 0; b < myData.NBlocks(); ++b)
    {
        this->PopulateBlock(myData, b);
        var.SetSelection(myData.Selection(b));
        writer.Put(var, &myData[b][0], TypeParam::PutMode);
        writer2.Put(var, &myData[b][0], TypeParam::PutMode);
    }
    writer2.Close();
    writer.Close();

The test is commented out in the PR because it produces invalid output.

Now, I know this use case is somewhat far-fetched. However:

Multiple engines per I/O are supported by the API, so I suppose if you let the app use adios2 this way, it should work correctly.
The test writes the same data into two files, which is not a realistic use case. However, in one of my apps, I calculate moments of the particle distribution. I write them into one output file as a snapshot just for the current time step, but I also calculate an average of the past n timesteps and write those smoothed moments into a separate file, so this pattern actually occurs in real life.

williamfgc commented 5 years ago

@germasch it's a valid use case, checkpoint and analysis data might write the same variable to different files. It's better practice to keep variables with their own IO/Engine (as done at read since Engine populates IO variables). We thought about making deep copies for IOs in DeclareIO, but no one has complained about defining its own variables per IO.

germasch commented 5 years ago

Sorry, I don't follow the part about making deep copies in DeclareIO, do you mean making deep copies of the IO in Engine::Open?

Anyway, that means a good amount of changes in the internal design are necessary, but I think that's a good thing because it'll separate things that were all co-mingled in one class. core::Variable contains user-facing general info about a variable that is independent of what (if any) Engine is being used. But it also accumulated a good amount of per-Engine (and maybe even Engine-private) information, like m_BlocksSpan and (with caveats) m_BlocksInfo. Separating that should make the code more maintainable (and fix this and similar bugs).

williamfgc commented 5 years ago

Making a deep copy of IO would allow you to solve this problem, but most use cases separate that case into two separate IOs (if you change one parameter it will affect both engines, e.g. checkpoint without metadata, while analysis with metadata). Each IO will have its own Engine/Variables. Personally, I don't think internal changes are necessary to address this kind of case as we still need Variable to be a bridge between the Engine and the user at Read, and will create a lot of confusion: Variable at Write is independent (which Span breaks BTW), but at Read is not?

This is a design decision since, as you pointed out, it's a valid (but rare) case since typically separation between IO tasks (so parameters, variables, engines) is done at the IO level for flexibility and independence between Engines and their settings. That's the whole point of the IO class. Which also applies at the runtime config file (Variables/Engines must be inside an IO).

germasch commented 5 years ago

I'm not suggesting to change the overall design of the API. Variables would remain under IO, as they are now. I like the current API. However, there is really no reason why m_BlocksSpan is part of core::Variable, it's an implementation detail. There would instead be a container of per-variable Spans inside the engine. This would allow you to solve the problem where multiple writers are using the Span interface. (Which isn't the exact bug that started this thread, I went with the simpler version that happens w/o Span.)

A similar idea holds for the m_BlocksInfo, though it's more complicated there, because some aspects of it are user-facing, so that'd require some careful thinking first (which I haven't done).

williamfgc commented 5 years ago

However, there is really no reason why m_BlocksSpan is part of core::Variable, it's an implementation detail.

@germasch the implementation is not bullet-proof, but encourages apps to keep IOs self-contained. There is a reason, though, while the Variable exists m_BlocksSpan exists. If m_BlocksSpan is moved to Engine two things could happen: 1) Variable addresses can become dangling (remove) and Engine will still try to collect spans which really need m_BlocksInfo, 2) Variable conflicts in Engine if two variables have the same name, but different type. We just went for the option that causes less damage and fits the IO self-contained model most apps use. Any API can be easily broken, this is why good practices are a must in every project. Good points, though.

cwsmith commented 2 years ago

I tried using one IO object to create two BP4 engines, one writer and one reader, and hit problems with variables being deleted when running more than one step (BeginStep & EndStep) despite each variable having a unique name. Switching to one IO object per engine fixed the problem.

If using one IO object per Engine is the suggested API usage, it would be nice to see that in the docs. Looking through the docs I see the IO interface section says

"Also, at reading always define separate IOs to avoid Variable name clashes."

and the Advice section mentions:

Always keep the IO object self-contained keeping its own set of Variables, Attributes and Engines. Do not combine Variables with multiple Engines or multiple modes, unless it’s 100% guaranteed to be safe in your program avoiding Variable access conflicts.

pnorbert commented 2 years ago

BP readers simply remove all variable definitions in EndStep to make sure they don't linger around. Reading outside of BeginStep..EndStep is invalid. And in the next BeginStep, a new set of Variables is created, instead of updating the existing map of variables.

adios2_reorganize uses one IO object to read in data then write it out. However, in general, this is not a safe approach as one does not know what a particular engine at runtime does to the IO object.

williamfgc commented 2 years ago

@cwsmith please clarify as the information seems to be included docs. Also, the behavior you're seeing is due to the more streaming oriented BP4, which considers each set of variables within a step as independent. I don't much see much benefit reusing IOs as global for several engines to new users, makes IO contents highly "volatile" as in the case you describe.

cwsmith commented 2 years ago

Hello. Thanks for the quick followup. I'm far from a seasoned ADIOS2 user so please excuse any misunderstanding I have of the context/usage.

@pnorbert

Thank you for the explanation of the BP variable cleanup. I saw calls to RemoveAllVariables from BeginStep but wasn't sure why.
I don't think the code was attempting to read/query outside of BeginStep-EndStep. I could have missed something though...
Thanks for the reference to the reorganize utility. IIUC, this is a tool to convert from one adios2 file format (BP4, HDF5, etc.) to another?

@williamfgc

Associating only one Engine per IO is fine for our use case.
Sorry for not being clear about the docs. In the Components Overview section the figure here appears to show two engines referencing one IO object. Combined with the text describing an IO object as an engine factory, which clearly isn't wrong, the pairing seems to imply that the multi-engine per IO case is OK/recommended.
Regarding the note in the IO interface section ("If two different engine types are needed ..."), I wasn't sure if this was specific to the case with different engine types (in my case the read/write engine pairs are the same type).
The Advice section section comment # 23 seemed ambiguous as it says IO objects should be 'self contained', but uses 'Engines' as opposed to 'Engine'. In the second sentence the meaning of combine was not clear ("Do not combine Variables with multiple Engines"). If there is a 'bad'/incorrect example that motivated this comment referencing/listing it could be helpful.

ornladios / ADIOS2

Two writers in one I/O #1316