All of the astrophysics codes are on github in http://github.com/amrex-astro
Our solid mechanics code is here: https://github.com/solidsuccs/alamo
All of the combustion codes are on GitHub at https://github.com/AMReX-Combustion
@khou2020 Incompressible flow solver: https://github.com/AMReX-Codes/IAMR
One small question: what I/O extension are you developing? As a CFD developer, I am considering writing code to convert .plt files to .hdf5 files so that Tecplot can read them. Maybe we can share experiences.
Best.
Jordan
There already exist converters from AMReX plotfiles to HDF5, in various forms and flavours. I imagine https://github.com/AMReX-Codes/amrex/tree/development/Tools/C_util/AmrDeriveTecplot might be related to the use case you're interested in. First places to look might be: C_util for HDF5 and Py_util for VTK. It's possible the particular converter I'm thinking of isn't on GitHub.
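For anyone rolling their own converter, the first step is usually parsing the plotfile's plain-text Header. A minimal sketch, assuming the common HyperCLaw-style layout for the leading fields (version string, component count, one component name per line, dimensionality, time, finest level); the exact field order should be verified against your own files, and later fields (prob_lo/hi, box layout, ...) are ignored here:

```python
def parse_plotfile_header(text):
    """Parse the leading fields of an AMReX plotfile Header.

    Assumes the common layout: version string, number of components,
    one component name per line, spatial dimension, time, finest level.
    Everything after the finest-level line is left unparsed.
    """
    lines = iter(text.splitlines())
    version = next(lines).strip()
    ncomp = int(next(lines))
    names = [next(lines).strip() for _ in range(ncomp)]
    dim = int(next(lines))
    time = float(next(lines))
    finest_level = int(next(lines))
    return {"version": version, "ncomp": ncomp, "names": names,
            "dim": dim, "time": time, "finest_level": finest_level}
```

With those fields in hand, a converter can lay out matching datasets in HDF5 or Tecplot before reading the per-level binary data.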
@jmsexton03
Thanks so much. This info is really helpful. Let me read it.
Best.
AmrDeriveTecplot might do more than you want. It builds a set of quads or bricks in Tecplot-speak that represent the "flattened" hierarchy of data, and then either writes those out in ascii or binary in a (probably now dated) format. The result is intended to be read on a single processor and manipulated in Tecplot as a "finite element" data structure.
Thanks for the reply. Does anyone know which application is more I/O intensive?
Hi:
I am one of the PnetCDF developers. We want to develop modules that write AMReX checkpoint, plot, and particle files in NetCDF format. We will be happy to share experiences with other developers.
Best, kaiyuan
@khou2020 Make sure you interact with the AMReX team directly when you make progress with the NetCDF format. It would be interesting to understand the performance of such a capability with respect to the native I/O and other options, and weigh that with the increased flexibility of the format. There may also be ways to leverage some of the underlying software that implements the native format in order to help the performance of the other formats.
We are developing an I/O module that writes AMReX checkpoint and particle files in a single NetCDF file instead of a directory hierarchy. In addition to the convenience of managing a single file, we hope it can improve I/O performance by reducing the number of files created and by having the processes that share the file perform I/O simultaneously.
Among those applications, does anyone know of one that is considered I/O intensive or known to have I/O performance problems?
I've had bad experiences with such an N:1 mapping of MPI processes to files regarding its performance. How well that scales up across multiple nodes will at least depend on your cluster's underlying filesystem, or not? I would be happy to be enlightened if I am totally wrong. Personally, my last understanding was that a Poor Man's Parallel I/O (PMPIO) strategy with some N:M mapping, where M is something like M = N / <processes per node>, yields simple yet effective results.
Have you thought about leaving the number of output files M open as a runtime parameter? Other I/O libraries already do that.
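A minimal sketch of such an N:M grouping, with M left as a runtime parameter. The function names here are illustrative, not any library's API, and the contiguous-group assignment is just one reasonable choice:

```python
def pmpio_group(rank, nprocs, nfiles):
    """Map an MPI rank to (file_index, position_in_group) for an N:M mapping.

    Ranks are split into nfiles contiguous groups of near-equal size;
    position 0 in each group can act as the writer/aggregator for its file.
    """
    group = rank * nfiles // nprocs                    # which file this rank shares
    leader = (group * nprocs + nfiles - 1) // nfiles   # lowest rank in the group
    return group, rank - leader

def files_for(nprocs, procs_per_node):
    """M = N / <processes per node>, as suggested above."""
    return max(1, nprocs // procs_per_node)
```

For example, with 8 ranks and 2 files, ranks 0-3 share file 0 and ranks 4-7 share file 1; making nfiles a runtime knob lets users tune M per filesystem.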
amrex does that as well (N:M mapping, with the number of files being a runtime parameter), and this has led to extremely efficient IO - as far as I know, AMReX’s IO performance is about as good as it gets (although there is always the hope that someone will take over the IO for future architectures). The move to netCDF or HDF or other formats could enable this transition (to more widely supported IO), but also might allow for increased compatibility with various graphics and processing software. So, while we wouldn't want to discourage folks from adding and working on this sort of thing, you should realize that the performance bar to meet, at least on current machines, is pretty high.
If I have not misunderstood the code, setting M to 1 in AMReX differs from the NetCDF approach in two ways.
It is just not clear that putting all data into a single file is inherently useful. Since there is no universally recognized format for AMR data, there is already a fundamental incompatibility with every major analysis and graphics package for scientific data. Putting all data into a single file does nothing to address that issue. And doing so in the way you suggest would subvert a significant capability that makes amrex IO so blazingly fast (not to mention the crucial added capability of demand-driven control).
I just want to be sure that performance measurements of any alternative IO strategies be compared with the amrex native approach running AT DESIGN SPEC (demand-driven in parallel and optimized for known contentions).
I met Ann Almgren last October and learned from her that the parallel I/O performance for AMReX is slow, although I did not ask for specifics. She also told me she welcomes any new parallel I/O methods for the AMReX framework if they can provide high performance. So this is why we are reaching out to this community, to help us understand more about the I/O demands of typical AMReX applications. Our goal is to develop a new PnetCDF-based I/O module in AMReX as an option for users, and we hope it can achieve high performance. Our focus is on large-scale runs.
Our understanding (@khou2020 and I) of the current implementation of the I/O module in AMReX is that a file is written exclusively by a single MPI process at any given time, which means that while one process is doing I/O, many others are waiting for their turns. Existing parallel I/O libraries, such as PnetCDF and HDF5 (or even the MPI-IO middleware), enable concurrent access to shared files, which can improve compute efficiency and performance.
Writing everything into a single file does not automatically guarantee the best performance, but it provides better data management, i.e. moving files around and dealing with large numbers of files. NetCDF and HDF5 are well-defined, portable, self-describing file formats. Since the PnetCDF and HDF5 libraries are built on top of MPI-IO, they take advantage of optimizations already in MPI. In addition, there are many third-party software tools for data analysis and visualization; whenever any of that software gets an update, you automatically benefit from it.
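For processes to write a shared file simultaneously without overlapping, each one needs a disjoint region of the flat variable. A minimal sketch of that offset computation, an exclusive prefix sum over per-box element counts, of the kind a module would do before a collective put call (the function name is illustrative, not part of any API):

```python
def box_offsets(box_sizes):
    """Starting offset of each box inside one flat shared variable.

    box_sizes[i] is the element count owned by process/box i; the returned
    offsets are an exclusive prefix sum, so write regions never overlap.
    """
    offsets, total = [], 0
    for n in box_sizes:
        offsets.append(total)
        total += n
    return offsets, total
```

Process i then writes box_sizes[i] elements starting at offsets[i]; in a real module the sizes would come from the BoxArray, and the write itself would be a collective MPI-IO operation.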
We have prototyped the proposed I/O module using PnetCDF and would like to measure its performance and compare it with the approach AMReX currently uses. Therefore, we are hoping to obtain some application cases that see high I/O costs. In the past, we did the same for FLASH, developed at U. Chicago.
I'm sorry Wei-keng, I think you misunderstood.
AMReX parallel I/O performance is in fact excellent. What I said was that we had been waiting for the hdf5 performance to catch up, which it has.
I told you that we would welcome alternative formats/methods if they are performant -- it's not that we don't have something that works great, it's that we understand that different users may prefer different formats and we want users to be able to use AMReX as efficiently as possible (so if they have tools that read hdf5, it's understandable they would like a code that writes hdf5, as an example).
We are always happy when developers and users contribute functionality that works well and that others find useful.
Ann
On Mon, Jul 22, 2019 at 5:07 PM Wei-keng Liao notifications@github.com wrote:
I'll second Ann's comment -- AMReX I/O is very fast. So fast, we don't even worry about it in the runtime. It is never a significant part of the wallclock time of any of our simulations.
I see. We will put this prototyping on hold until someone needs files in NetCDF format.
Right, for any production runs I have seen IO has never been a significant fraction of the runtime across the many amrex applications. We certainly have benchmark cases, and setups that generate piles of disk activity, but relative to the computation it's always been a tiny fraction.
Two areas where we could improve:
Hi:
I am developing an I/O extension for AMReX applications and want to test my implementation on real applications. Aside from the built-in examples, does anyone know where I can find applications that use AMReX?