Deprecate all instances of `gri30_mix` and `gri30_multi`

ischoegl commented 5 years ago

Cantera version

2.5.0a3

Operating System / Python/MATLAB version

all

Expected Behavior

The gri30_mix and gri30_multi phases are deprecated and are not part of all gri30.yaml versions (the version in test/work/python defines those, the default version does not). For consistency, all instances should be removed for 2.5, as the transport model can be specified using a flag. I.e. the differentiation of transport models in phases is largely obsolete:

In [1]: import cantera as ct

In [2]: gas = ct.Solution('gri30.yaml', 'gri30_multi')
CanteraError: 
***********************************************************************
InputFileError thrown by AnyValue::getMapWhere:
Error on line 16 of /usr/local/lib/python3.6/dist-packages/cantera/data/gri30.yaml:
List does not contain a map where 'name' = 'gri30_multi'
[...]
***********************************************************************

In [3]: gas = ct.Solution('gri30.xml', 'gri30_multi')

In [4]: gas.composite
Out[4]: ('IdealGas', 'Gas', 'Multi')

In [5]: gas = ct.Solution('gri30.yaml', transport_model='Multi')

In [6]: gas.composite
Out[6]: ('IdealGas', 'Gas', 'Multi')

Questions

There are two questions:

it is not straight forward to completely remove the gri_mix and gri_multi phases, as they may be widely used in user-developed code. Should there be a customized error message (or warning)?
the default behavior should be clarified: will the transport model default to mixture-averaged or should no transport manager be instantiated if the transport_model flag is not used? (i.e. under many circumstances, the transport manager is not used)

Actual Behavior

gri30.yaml are not consistent, i.e. some have hand-coded entries.

Deprecating gri30_mix is straight-forward (it is replacable by gri30):

$ find . -type f | xargs grep 'gri30_mix'
./samples/cxx/flamespeed/flamespeed.cpp:        IdealGasMix gas("gri30.cti","gri30_mix");
./test_problems/clib_test/output_output.txt:  gri30_mix:
./test_problems/clib_test/output_blessed.txt:  gri30_mix:
./test_problems/clib_test/clib_test.c:    int thermo = thermo_newFromFile("gri30.xml", "gri30_mix");
./test_problems/clib_test/clib_test.c:    int kin = kin_newFromFile("gri30.xml", "gri30_mix", thermo, 0, 0, 0, 0);
./test_problems/mixGasTransport/mixGasTransport.cpp:        IdealGasMix g("gri30.xml", "gri30_mix");
./test_problems/multiGasTransport/multiGasTransport.cpp:        IdealGasMix g("gri30.xml", "gri30_mix");
./test/matlab/TestImport.m:            gas = Solution('gri30.xml', 'gri30_mix');
./test/work/python/gri30.yaml:- name: gri30_mix
./include/cantera/kinetics/importKinetics.h: *        ok =  buildSolutionFromXML(root, "gri30_mix", "phase", th, kin)
./interfaces/matlab/toolbox/GRI30.m:        s = Solution('gri30.xml', 'gri30_mix');
./interfaces/cython/cantera/examples/onedim/flamespeed_sensitivity.py:gas = ct.Solution('gri30.xml', 'gri30_mix')
./interfaces/cython/cantera/examples/onedim/flame_fixed_T.py:gas = ct.Solution('gri30.xml', 'gri30_mix')
./interfaces/cython/cantera/examples/onedim/diffusion_flame.py:gas = ct.Solution('gri30.xml', 'gri30_mix')
./interfaces/cython/cantera/data/gri30_highT.yaml:- name: gri30_mix
./interfaces/cython/cantera/data/gri30_highT.cti:ideal_gas(name = "gri30_mix",
./interfaces/cython/cantera/data/gri30.cti:ideal_gas(name = "gri30_mix",
./data/inputs/gri30_highT.cti:ideal_gas(name = "gri30_mix",
./data/inputs/gri30.cti:ideal_gas(name = "gri30_mix",
[...]

There aren't many instances of gri30_multi:

$ find . -type f | xargs grep 'gri30_multi'
./test/work/python/gri30.yaml:- name: gri30_multi
./interfaces/matlab/toolbox/GRI30.m:        s = Solution('gri30.xml', 'gri30_multi');
./interfaces/cython/cantera/data/gri30_highT.yaml:- name: gri30_multi
./interfaces/cython/cantera/data/gri30_highT.cti:ideal_gas(name = "gri30_multi",
./interfaces/cython/cantera/data/gri30.cti:ideal_gas(name = "gri30_multi",
./data/inputs/gri30_highT.cti:ideal_gas(name = "gri30_multi",
./data/inputs/gri30.cti:ideal_gas(name = "gri30_multi",
[...]

Steps to reproduce

use newest development version (compiled from source)

ischoegl commented 5 years ago

All occurrences of gri30_mix and gri30_multi within cantera itself are eliminated in ~#735~ #737 (except for XML/CTI/YAML files). The issue of usage in existing user code remains.

speth commented 5 years ago

My opinion, which I acknowledge may differ from the expectations of some users, is that the input files that come with Cantera should mostly be regarded as examples, not a source of data to be used as the basis of applications built for scientific or engineering purposes. As such, I don't think that there should be an expectation that the phase names in these files is part of Cantera's stable API, and we should be able to change or remove these files as needed.

I think it still makes sense for the transport field of the phase definition to determine the default transport manager, with the transport_manager keyword argument being available as an override. I would not want to force the user to always have to specify the transport model while importing the phase.

bryanwweber commented 5 years ago

@speth I disagree with the following statement (recognizing that you acknowledged that your opinion my differ):

is that the input files that come with Cantera should mostly be regarded as examples, not a source of data to be used as the basis of applications built for scientific or engineering purposes

If these files were not intended as data, why are they in a folder called data? In any case, the cat is out of the bag now for GRI-3.0, and yes, people shouldn't use that for real simulations, and yes, there are other ways to specify this information, but in this case I come down very strongly on the side of don't break user's code without warning.

In other words, I am very strongly in favor of retaining the historical gri30, gri30_mix, and gri30_multi phase names for GRI-3.0 only. I think it could make a lot of sense to remove the use of these phase names internally if this is something that we don't want to encourage users to do. However, I don't see a good reason to put up with the turmoil caused by removing these phase names from the input file we distribute.

Regarding the differences between the versions generated by ctml2yaml, cti2yaml, and ck2yaml, these will be resolved by committing the canonical YAML file to the repo once I finish #693 and removing the conversion of the YAML input files at build time.

ischoegl commented 5 years ago

@bryanwweber ... I agree with you with regard of not breaking code, but I also agree that the imho superfluous phase definitions should be phased out (except for what transport_manager is used as the default).

My suggested solution would be to simply catch the gri30_mix and gri30_multi phase labels in the C++ layer (YAML parser only), issue a DeprecationWarning, and change the transport model internally. I.e. they should no longer be present in the YAML files.

bryanwweber commented 5 years ago

But if I'm interpreting what @speth said correctly, the phase specification is not superfluous and indeed preferred, i.e.

I think it still makes sense for the transport field of the phase definition to determine the default transport manager, with the transport_manager keyword argument being available as an override. I would not want to force the user to always have to specify the transport model while importing the phase.

ischoegl commented 5 years ago

True. I guess I meant to say that duplicate phase definitions are superfluous (sorry). I am essentially assuming that going forward, most YAML files will use the

  transport: mixture-averaged

setting as the default.

For gri30, the historical differentiation really only pertains to those lines, hence my suggestion to catch the _mix and _multi suffixes internally until they can be phased out.

decaluwe commented 5 years ago

But if we view the purpose of these cti files as providing examples, I think this provides a clear example of how one can vary the transport model for a given thermodynamic and chemical phase definition, from within the input file. The page definition itself is superfluous, but one could say that about gri-3.0, in general. It’s not the phase definition that is the resource, but the example of how one can mix and match transport models.

Sent from my iPhone

On Oct 29, 2019, at 8:52 PM, Ingmar Schoegl notifications@github.com wrote:

True. I guess I meant to say that duplicate phase definitions are superfluous (sorry). I am essentially assuming that going forward, most YAML files will use the

transport: mixture-averaged setting as the default.

For gri30, the historical differentiation really only pertains to those lines, hence my suggestion to catch the behavior internally until it can be phased out.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

ischoegl commented 5 years ago

My biggest hangup about this is that multiple phase definitions are only differentiated by a single entry. Yet, they need to be hand-coded (i.e. generate a YAML file using ck2yaml, and then edit).

I believe that letting users know that the transport_model flag is all they need may be more instructive overall. A FutureWarning without immediate deprecation would be an adequate solution?

ischoegl commented 5 years ago

@decaluwe ... regarding CTI files, this may be pretty discipline specific. If you're only dealing with gas phase kinetics (like myself), there isn't much of a reason to ever open the CTI/XML or YAML files that are generated from chemkin input. If you're looking at other thermo phases, there's a lot more reason to look at (and edit) input files.

From my perspective, I have never viewed the CTI (or YAML) format as examples. It's really the python example suite (or matlab/C++) that illustrates how things are being used.

speth commented 5 years ago

Given the multiple issues that have cropped up with errors in gri30.cti, I would prefer the gri30.yaml file that we include to be exactly what is generated by ck2yaml, even if we don't continue to generate it on the fly [in order to simplify the requirements for the build environment].

I often find that the fact that the default phase for gri30.cti doesn't specify a transport model to be an annoyance, given that any input file that a user generates using ck2cti will have a transport model set, assuming that they provided transport data.

Regarding @decaluwe's point, I'm not sure whether the "multiple slightly different phase definitions" approach is the one that we want to recommend to users. I know I use the transport_model=... keyword argument far more often than I add an additional phase definition to the input file.

If there's a place in the C++ layer where we can catch the use of the gri30_mix and gri30_multi phase names and issue a warning without too much hassle, I would like to see them deprecated.

speth commented 5 years ago

Also, I'd like to point out that there's no case here where code silently goes from working to not working. Errors associated with gri30_mix and gri30_multi not being defined in gri30.yaml will only occur when updating code to replace use of gri30.cti and gri30.xml (which will eventually emit warnings when those two formats are deprecated).

decaluwe commented 5 years ago

For the record, the point I was making was not about “demonstrating the creation of multiple, slightly diff phase definitions in a single file,” but rather “demonstrating, in a single file, the creation of of phase definitions with varying transport models.” I.e., my suggestion wasn’t that the example teaches users to create multiple instances of the same phase, only with different transport models. More like if one is building a phase model and has not done so before, a common approach is to go to an existing cti file and copy example of what is demonstrated there. Even if proper documentation exists, my guess is that most users will start from a previous example before going through the docs. Again, if the cti files are there for demonstration purposes, having some form of explicit demonstration of how the difft transport models are instantiated at the input file level is a good and valid thing.

Now, there is probably a more efficient way to achieve this goal, in terms of having one model, and then a comment listing other keyword options. Also agree that multiple phase defs in a single input serves no purpose, as a user. Lastly, I’m not even necessarily opposing the depreciation, but it’s just something to keep in mind: if the input files are there for demonstration purposes, it would be good if they explicitly demonstrated a range of phase definition options, for the user.

Sent from my iPhone

On Oct 29, 2019, at 9:30 PM, Ray Speth notifications@github.com wrote:

Given the multiple issues that have cropped up with errors in gri30.cti, I would prefer the gri30.yaml file that we include to be exactly what is generated by ck2yaml, even if we don't continue to generate it on the fly [in order to simplify the requirements for the build environment].

I often find that the fact that the default phase for gri30.cti doesn't specify a transport model to be an annoyance, given that any input file that a user generates using ck2cti will have a transport model set, assuming that they provided transport data.

Regarding @decaluwe's point, I'm not sure whether the "multiple slightly different phase definitions" approach is the one that we want to recommend to users. I know I use the transport_model=... keyword argument far more often than I add an additional phase definition to the input file.

If there's a place in the C++ layer where we can catch the use of the gri30_mix and gri30_multi phase names and issue a warning without too much hassle, I would like to see them deprecated.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

bryanwweber commented 5 years ago

I guess I'm in the minority here...

It seems like the consensus is that users should not use the transport field of the phase definition (in the YAML input file) to specify the transport model, and instead the transport_model kwarg to the Python Solution class (and similar arguments for other interfaces) is the preferred method. Is that the consensus? If so, why is that the preference?

there's no case here where code silently goes from working to not working

This is true, but removing those phase definitions makes the advice we give much more complicated. If we leave them there, we can tell people "just change .xml or .cti to .yaml and it will just work™️". If we remove those phase definitions, we have to say "make the change, but also if you happen to be using this mechanism, you need to take this extra step sometimes but not always".

Given the multiple issues that have cropped up with errors in gri30.cti, I would prefer the gri30.yaml file that we include to be exactly what is generated by ck2yaml, even if we don't continue to generate it on the fly [in order to simplify the requirements for the build environment].

I think this is the best argument I've heard for removing those extra phase definitions. However, I sure hope we've caught all the differences at this point, given the work in #718 and that GRI 3.0 won't be changing in the future (I suppose we could have said this before...). So I hope that gri30.yaml won't need to be edited ever again and any hand-edits at this point won't need to be changed. In addition, once we have the YAML files committed, I would advocate that we no longer distribute the original GRI-3.0 input files or include them in the repository, except in the test files directory (perhaps once the CTI and XML formats are removed).

As I said before, I think it could make sense to remove the use of these extra phases in our examples and demonstrate some of the other ways to set the transport model. But I really don't think that we should remove them from the gri30.yaml (and .cti and .xml) that is distributed. In the end, I see retaining the extra phase definitions as very little maintenance burden for us, but removing them leading to a cavalcade of questions on the Users' Group that I don't want to have to answer.

ischoegl commented 5 years ago

In the end, I see retaining the extra phase definitions as very little maintenance burden for us, but removing them leading to a cavalcade of questions on the Users' Group that I don't want to have to answer.

I'll stay out of the rationale of what exact YAML version is distributed, but do have a preference for the one generated by default (i.e. ck2yaml output; also, I'd actually suggest keeping the original inputs so anyone can easily verify that they're unmodified). I.e. I'm with @speth on this one.

However, I think that the introduction of an entirely new format can bring some changes, which imho should not be entirely unexpected (all my code broke when cython was introduced in the transition to Python 3.x). Imho, some well-placed warnings is all that should be needed to nudge the users to the correct action, which should avoid the cavalcade of questions, which I agree no one wants to (or should have to!) answer.

Admittedly, some people will not read the actual messages that are issued, but this can be dealt with pretty efficiently ...

ischoegl commented 5 years ago

@bryanwweber

It seems like the consensus is that users should not use the transport field of the phase definition (in the YAML input file) to specify the transport model, and instead the transport_model kwarg to the Python Solution class (and similar arguments for other interfaces) is the preferred method. Is that the consensus? If so, why is that the preference?

The way I interpret this, the transport field should hold the default behavior (over-ridden by the transport_model kwarg), which can be later changed to whatever a user should prefer (see free flame calculations, which start out as Mix and then switch over to Multi). I.e. the current behavior (naming convention)

In [1]: import cantera as ct

In [2]: gas = ct.Solution('gri30.xml', 'gri30_mix')

In [3]: gas.name, gas.composite
Out[3]: ('gri30_mix', ('IdealGas', 'Gas', 'Mix'))

In [4]: gas.transport_model = 'Multi'

In [5]: gas.name, gas.composite
Out[5]: ('gri30_mix', ('IdealGas', 'Gas', 'Multi'))

is imho not consistent.

speth commented 5 years ago

It seems like the consensus is that users should not use the transport field of the phase definition (in the YAML input file) to specify the transport model, and instead the transport_model kwarg to the Python Solution class (and similar arguments for other interfaces) is the preferred method. Is that the consensus? If so, why is that the preference?

It's not that I think the transport field shouldn't be used in general, it's that I think multiple phase definitions is a bad way to select from among a set of possible models that work with the given phase (should we also add a phase definition for gri30_unity_lewis?).

I mainly want to get rid of the gri30_mix and gri30_multi phase definitions because their presence makes it harder for users to transition to using other mechanisms. If I start with an example that uses gri30.cti, the code to instantiate that with a mixture-averaged model is:

gas = ct.Solution('gri30.cti', 'gri30_mix')

But then, if I want to use my own mechanism that I have converted from a Chemkin input file, how am I supposed to understand that the correct way to adapt this is:

gas = ct.Solution('mymech.cti')

Or worse, that the corresponding call to:

gas = ct.Solution('gri30.cti', 'gri30_multi')

is

gas = ct.Solution('mymech.cti', transport_model='Multi')

I thought that the CTI -> YAML transition would be the best (if still imperfect) time to make this change, given that users have to make some modifications to their code to use the YAML format anyway, and that otherwise we don't really have a clear way to provide a deprecation warning relating to the use of a particular input file.

This is true, but removing those phase definitions makes the advice we give much more complicated. If we leave them there, we can tell people "just change .xml or .cti to .yaml and it will just work™️". If we remove those phase definitions, we have to say "make the change, but also if you happen to be using this mechanism, you need to take this extra step sometimes but not always".

I agree that it makes the instructions for "how to start using YAML" more complicated, but those instructions are at least a natural place to describe the appropriate modifications to the user's code, and I don't think they're really that complicated. Is there anything that isn't covered by the following (for Python, at least):

Replace ct.Solution('gri30.cti', 'gri30_multi') with ct.Solution('gri30.yaml', transport_model='multicomponent')
Replace ct.Solution('gri30.cti', 'gri30_mix') with ct.Solution('gri30.yaml')
Replace ct.Solution('foo.cti') withct.Solution('foo.yaml')wherefoo` is any other input file distributed with Cantera.

In any case, these instructions will also have to cover conversion of the user's own CTI / XML files using cti2yaml and xml2yaml, so it's not like the alternative is really a single line.

But I really don't think that we should remove them from the gri30.yaml (and .cti and .xml) that is distributed.

I am definitely not advocating removing these phase names from the existing CTI/XML files.

In addition, once we have the YAML files committed, I would advocate that we no longer distribute the original GRI-3.0 input files or include them in the repository, except in the test files directory (perhaps once the CTI and XML formats are removed).

Removing these files is an interesting suggestion. It would be consistent with the fact that we don't currently keep the original input files for other models (e.g. nDodecane_Reitz.cti). I'm wondering if that practice would have made it easier or more difficult to avoid any of the issues with gri30.cti that have cropped up. For the most recent one (the two changed rate constants), I think not having the original file in the repo would have made it almost impossible to understand how the error came about -- the header of the original input file with its different date was the necessary clue. I'm not sure I see the difference in having these original input files in the data directory versus test/data, given that they are only installed into the Python modules test/data directory in any case.

The way I interpret this, the transport field should hold the default behavior (over-ridden by the transport_model kwarg), which can be later changed to whatever a user should prefer (see free flame calculations, which start out as Mix and then switch over to Multi). I.e. the current behavior (naming convention) [...] is imho not consistent.

That's an interesting point, and not one I had considered. If a user looks at gas.name, they would be led to believe that the original transport model was being used, even if it had been changed, for example within the 1D solver.

ischoegl commented 5 years ago

All: I created PR #737 to illustrate my suggestion of catching the phase definitions with a transitional warning.

ischoegl commented 4 years ago

Alternative implementation proposed in #768 improves over #738.

ischoegl commented 4 years ago

This can be closed per #768

bryanwweber commented 4 years ago

I think this was partially resolved by #768, since I listed a URL for a wiki page in the warning message that doesn't exist 😂 I'd like to leave this open until that's resolved.

bryanwweber commented 4 years ago

Here is the wiki page: https://github.com/Cantera/cantera/wiki/deprecate_gri30_phases Comments/edits welcome

ischoegl commented 4 years ago

Here is the wiki page: https://github.com/Cantera/cantera/wiki/deprecate_gri30_phases Comments/edits welcome

Looks ok to me

speth commented 4 years ago

I made a couple of small edits. I think with that, we can actually close this now.

bryanwweber commented 4 years ago

Thanks @speth!

Cantera / cantera