ornladios / ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program
https://adios2.readthedocs.io/en/latest/index.html
Apache License 2.0

Tools Interface in ADIOS2 #1274

Open rtschueter opened 5 years ago

rtschueter commented 5 years ago

In the latest commits, work on a TAU interface for ADIOS has started. Currently, the interface seems tailored to TAU. However, a generic tools interface would be of great value for performance tools in general. Are there any plans to provide a tools interface similar to the one in ADIOS1? On the one hand, there might be potential for cooperation between developers of different performance tools. On the other hand, developers of ADIOS and of performance tools could discuss how to support each other, e.g., what performance data should be provided by the tools.

eisenhauer commented 5 years ago

The short answer is "yes". In a longer form: the TAU work here is very exploratory, introduced more with the goal of helping to understand your latter point: what performance data should be provided by tools? In particular, these commits are aimed at understanding the performance of the SST streaming engine. The challenge is that many performance-related aspects of SST (like the actual transfer of data) happen asynchronously and are impacted by reader behavior. So to understand SST performance we have to understand the interplay of the reader and writer, how buffer limits come into play, read patterns, etc. (The asynchrony and overlapping of some operations give it a nuance that may be a bit outside the focus of standard HPC performance tools.) The upshot is that we don't know exactly what we need to instrument yet, and we aren't sure yet of the form we'd like that instrumentation to take. For example, we may like some basic performance info built in to ADIOS2 so that it's available without using an external tool.

What has been merged is a placeholder so that we can work towards more useful performance understanding without minimizing issues of a code base that is still changing. We're absolutely interested in dialogue with performance tool developers. To the extent that you and your posse (4 thumbs up for this issue in two hours!) are willing/able to weigh in we're happy to collaborate.

sklasky commented 5 years ago

Can we get some of the TAU guys (Sameer, Kevin) to talk to the ADIOS team to discuss this? Thanks

eisenhauer commented 5 years ago

We're working with Kevin and he is responsible for what's there now...

williamfgc commented 5 years ago

@khuck

khuck commented 5 years ago

@rtschueter @eisenhauer @williamfgc Agreed, what is there now is very TAU-centric - but it doesn't have to be. I am coming to ORNL for the CODAR meeting March 28-29, and will arrive 1 day earlier so I can attend the ADIOS meeting on the 27th. We should talk about how to make it a generic version at the ADIOS meeting.

eisenhauer commented 5 years ago

Sounds good Kevin. Philip and I had already tossed around the idea of genericizing this, but decided the thing to do was to get it in master as a placeholder first, to help us figure out what we needed, and then sort out making it more generic later...

rtschueter commented 5 years ago

@khuck @eisenhauer Was there any news about the tools interface during the CODAR meeting?

khuck commented 5 years ago

@rtschueter Unfortunately we didn't get as much time to talk about it as hoped, but I've modified the timers to now be rather generic: https://github.com/khuck/ADIOS2/tree/adiost_generic/source/adios2/toolkit/profiling/external

You can see the TAU implementation of this particular interface here: https://github.com/UO-OACISS/tau2/blob/master/src/Profile/TauADIOS2.cpp

...although I am in the process of changing that filename from TauADIOS2.cpp to TauGenericAPI.cpp because there's nothing ADIOS-specific about it.

Other than initialization, timer start/stop, counter sampling, metadata and thread registration, are there any other API calls that Score-P would need to support a basic profiling interface? If this works well with the ADIOS2 library, it would be great to encourage other libraries to implement something similar. Then as long as a tool is linked or preloaded into the environment, it would get the API calls.

Keep in mind this API is really only being used to investigate performance issues in SST, there isn't much instrumentation in the ADIOS2 library as a whole - yet.
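For readers who want a concrete picture of the call categories listed above (initialization, timer start/stop, counter sampling, metadata and thread registration), here is a minimal sketch of a tool-agnostic hook table. It is an illustration only and not the code in the linked branch: the `toolstub` namespace, `ToolHooks`, `register_tool` and `ScopedTimer` are all hypothetical names, and explicit registration is assumed here instead of the link-time or preload-time discovery mentioned above.

```cpp
#include <cassert>

namespace toolstub { // hypothetical namespace, not the actual ADIOS2 or TAU API

// One optional callback per call category named in the comment above.
// A null pointer means "no tool provides this call".
struct ToolHooks {
    void (*init)() = nullptr;
    void (*register_thread)() = nullptr;
    void (*timer_start)(const char *name) = nullptr;
    void (*timer_stop)(const char *name) = nullptr;
    void (*sample_counter)(const char *name, double value) = nullptr;
    void (*metadata)(const char *key, const char *value) = nullptr;
};

// Process-wide hook table. Registration is not thread-safe in this sketch;
// it is assumed to happen once, before any instrumented code runs.
inline ToolHooks &hooks() {
    static ToolHooks table;
    return table;
}

// Install a tool's callbacks and run its initialization hook, if any.
inline void register_tool(const ToolHooks &tool) {
    hooks() = tool;
    if (tool.init) tool.init();
}

// Library-facing wrappers: forward to the tool if present, otherwise no-op.
inline void register_thread() {
    if (hooks().register_thread) hooks().register_thread();
}
inline void timer_start(const char *name) {
    if (hooks().timer_start) hooks().timer_start(name);
}
inline void timer_stop(const char *name) {
    if (hooks().timer_stop) hooks().timer_stop(name);
}
inline void sample_counter(const char *name, double value) {
    if (hooks().sample_counter) hooks().sample_counter(name, value);
}
inline void metadata(const char *key, const char *value) {
    if (hooks().metadata) hooks().metadata(key, value);
}

// RAII helper: starts a timer on construction and stops it on scope exit.
class ScopedTimer {
public:
    explicit ScopedTimer(const char *name) : name_(name) {
        assert(name_ != nullptr);
        timer_start(name_);
    }
    ~ScopedTimer() { timer_stop(name_); }
    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

private:
    const char *name_;
};

} // namespace toolstub
```

With a table like this, instrumented library code would only ever call the wrappers (for example by placing a `toolstub::ScopedTimer` at the top of a function), and each call costs a single null check when no tool is present.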

khuck commented 5 years ago

Hey @rtschueter @eisenhauer @williamfgc @pnorbert -

I thought I would continue this discussion here... I extracted the stub library as a separate repo, hopefully to be used by other applications, libraries and tools: https://github.com/khuck/perfstubs (the repo will move from my personal area to the https://github.com/UO-OACISS if there is sufficient interest).

The project contains three directories. The perfstub_api directory is a "generic" version of the previous ADIOS stub timers that were TAU-specific; it is compiled and linked as a separate library. The implementation directory contains a dummy implementation of a printf tool; it is also compiled and linked as a separate library. The examples directory contains a C and a C++ example. The etc/go.sh script will configure, build and run the examples as CTests (I tested on both OSX and Linux). It builds both a dynamically linked and a statically linked version to test both configurations. Hopefully I kept everything simple enough that it is all self-explanatory. If there are any questions, let me know. I tested the examples with tauexec and it seems to work for me (since I already added `perftool*` functions to the current TAU master). There's a README file with a few more details: https://github.com/khuck/perfstubs/blob/master/perfstub_api/README.md
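To give a rough idea of what a dummy printf-style tool might look like, here is a small sketch of the tool side. It is not the code in the implementation directory: the `printf_tool` namespace and every function in it are hypothetical, and it simply assumes a tool that prints each timer event and accumulates wall-clock time per timer name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <map>
#include <string>

namespace printf_tool { // hypothetical names, for illustration only

using Clock = std::chrono::steady_clock;

// Accumulated statistics for one named timer.
struct TimerStats {
    long calls = 0;        // completed start/stop pairs
    double seconds = 0.0;  // total elapsed wall-clock time
    Clock::time_point started;
    bool running = false;
};

// Per-process table of timers, keyed by name (not thread-safe in this sketch).
inline std::map<std::string, TimerStats> &table() {
    static std::map<std::string, TimerStats> timers;
    return timers;
}

// Record the start time and print the event.
inline void timer_start(const char *name) {
    assert(name != nullptr);
    TimerStats &stats = table()[name];
    stats.started = Clock::now();
    stats.running = true;
    std::printf("[tool] start %s\n", name);
}

// Add the elapsed time to the running total; unmatched stops are ignored.
inline void timer_stop(const char *name) {
    assert(name != nullptr);
    auto it = table().find(name);
    if (it == table().end() || !it->second.running) return;
    TimerStats &stats = it->second;
    stats.seconds +=
        std::chrono::duration<double>(Clock::now() - stats.started).count();
    stats.calls += 1;
    stats.running = false;
    std::printf("[tool] stop  %s\n", name);
}

// Print one summary line per timer, e.g. at program exit.
inline void report() {
    for (const auto &entry : table()) {
        std::printf("[tool] %s: %ld call(s), %.6f s\n", entry.first.c_str(),
                    entry.second.calls, entry.second.seconds);
    }
}

} // namespace printf_tool
```

A tool like this is mostly useful for checking that the instrumentation fires where expected; a real tool would replace the printing and the map with its own measurement infrastructure.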

Where do we go from here? Well, the code in perfstub_api could go right in as a replacement for the https://github.com/ornladios/ADIOS2/tree/master/source/adios2/toolkit/profiling/taustubs directory, or this perfstubs library could be an ADIOS2 submodule - there are benefits to both. Regardless, the instrumentation in ADIOS2 would have to be updated to the new API names.

Keep in mind I am not emotionally committed to anything here, I am asking for public comment because I would like it to work for as many people as possible. :) Let me know if I was clumsy or sloppy with namespaces, camel case, formatting, etc. or any other assumptions. Could this be done as a header-only library?

@rtschueter does this API seem sufficient for Vampir/Score-P support? Do you think other applications or libraries would be interested? I presented the existing ADIOS2/TAU version to some PNetCDF and HDF5 developers on a related project conference call last week, and they seemed interested. Also, if something like this already exists, let me know! I don't want to reinvent the wheel.

@williamfgc @pnorbert I'll let you decide how we should integrate the new instrumentation into ADIOS2 beyond this point. I can update the existing code and submit a pull request if you like.

Thanks!

pnorbert commented 5 years ago

We have a thirdparty/ folder for all external codes. Your repo should also go there. Then we can pull a certain version or tag at any time from the external repo into the ADIOS project.

(Sent by email in reply to Kevin Huck's comment above, dated Wed, Apr 24, 2019, 2:18 PM.)