OPM / opm-simulators

Simulator programs and utilities for automatic differentiation.
http://www.opm-project.org
GNU General Public License v3.0
110 stars 118 forks source link

Segfault on unknown keywords #1587

Open hnil opened 5 years ago

hnil commented 5 years ago

Example with compiled with mpi:

Exception caught: PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
[hnil-workstation:16799] *** Process received signal ***
[hnil-workstation:16799] Signal: Aborted (6)
[hnil-workstation:16799] Signal code:  (-6)
[hnil-workstation:16799] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fdf97020390]
[hnil-workstation:16799] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fdf96c7a428]
[hnil-workstation:16799] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fdf96c7c02a]
[hnil-workstation:16799] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c8f7)[0x7fdf977008f7]
[hnil-workstation:16799] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92a46)[0x7fdf97706a46]
[hnil-workstation:16799] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92a81)[0x7fdf97706a81]
[hnil-workstation:16799] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(__cxa_rethrow+0x49)[0x7fdf97706d09]
[hnil-workstation:16799] [ 7] /home/hnil/Documents/GITHUB/OPM/opm_source/master/builds/default/opm-simulators/bin/flow(main+0x889)[0x8129a9]
[hnil-workstation:16799] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fdf96c65830]
[hnil-workstation:16799] [ 9] /home/hnil/Documents/GITHUB/OPM/opm_source/master/builds/default/opm-simulators/bin/flow(_start+0x29)[0x849c29]
[hnil-workstation:16799] *** End of error message ***
Aborted (core dumped)

Is it a reason that default do not avoid segfault?

GitPaean commented 5 years ago

Hi, in your case, there is no segmentation fault, it is just an exception and the simulation gets terminated.

OPMUSER commented 5 years ago

@GitPaean Yes I know but for non-developers it gives the impression that something really bad has just happened.

Can we not sent this part of the print out to the terminal and the print file (or something similar) :

Exception caught: PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
Flow has Terminated.

And this to the DEBUG file:

Exception caught: PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.
[hnil-workstation:16799] *** Process received signal ***
[hnil-workstation:16799] Signal: Aborted (6)
[hnil-workstation:16799] Signal code:  (-6)
[hnil-workstation:16799] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fdf97020390]
[hnil-workstation:16799] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fdf96c7a428]
[hnil-workstation:16799] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fdf96c7c02a]
[hnil-workstation:16799] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c8f7)[0x7fdf977008f7]
[hnil-workstation:16799] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92a46)[0x7fdf97706a46]
[hnil-workstation:16799] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92a81)[0x7fdf97706a81]
[hnil-workstation:16799] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(__cxa_rethrow+0x49)[0x7fdf97706d09]
[hnil-workstation:16799] [ 7] /home/hnil/Documents/GITHUB/OPM/opm_source/master/builds/default/opm-simulators/bin/flow(main+0x889)[0x8129a9]
[hnil-workstation:16799] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fdf96c65830]
[hnil-workstation:16799] [ 9] /home/hnil/Documents/GITHUB/OPM/opm_source/master/builds/default/opm-simulators/bin/flow(_start+0x29)[0x849c29]
[hnil-workstation:16799] *** End of error message ***
Aborted (core dumped)

That way it is easy for non-developers to recognize it is just an input error.

GitPaean commented 5 years ago

@GitPaean Yes I know but for non-developers it gives the impression that something really bad has just happened.

Yes. We should give more intuitive message and even some instructions, like Please remove the unknown keyword NARROW and try again.

blattms commented 5 years ago

Same holds for the parameter system: OPM/ewoms#385.

joakim-hove commented 5 years ago

Exception caught: PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized. terminate called after throwing an instance of 'std::invalid_argument' what(): PARSE_UNKNOWN_KEYWORD: Keyword NARROW not recognized.

Yes - it is a bit brutal. Everything can be improved, and of course also the error message issued here. For this type of errors you can use the ParseContext to control the behaviour when an error - i.e. missing keyword in this case - is encountered. To ignore the, or only warn, about the error PARSE_UNKNOWN_KEYWORD you could set an environment variable as this:

bash% export OPM_ERRORS_WARN=PARSE_UNKNOWN_KEYWORD
bash% flow CASE.DATA

We have discussed adding a commandline option for this as well - so you could do: flow --opm-errors-warn=PARSE_UNKNOWN_KEYWORD - but that is not implemented yet.

But: a bug fat warning is in place here: The moment you start ignoring unknown keywords like this chanes are high that the parser will eventually go astray, and you just up with another crash later - which is difficult to debug. That is the reason the default behaviour is to throw an exception when an unkown keyword is encountered.

So - if possible your best bet is to simple add the keyword.

OPMUSER commented 5 years ago

@joakim-hove and @GitPaean If the parser found an error could it not automatically switch to NOSIMmode to avoid assembling the deck - would that minimize the chance of a crash? If that is the case thenflow --opm-errors-warn=PARSE_UNKNOWN_KEYWORD should be always be set.

Also if NOSIM has been set in the deck or command line then I think flow --opm-errors-warn=PARSE_UNKNOWN_KEYWORD should be always be set.

In general I'm not a great fan of setting environment variables, a part from paths, as for each run I want to have an audit trail so I know what was used to obtain the results from the run, that is all the information should be in an output file. I'm just updating the manual for the 2018-10 release and currently we have 134 command line parameters so I think this getting a bit out of hand. I realize this is under review for the 2019-04 release.

I still think the re-direction would enhance the "experience" for the user.

alfbr commented 5 years ago

In general I'm not a great fan of setting environment variables

This view has actually dominated the choices made so far. Defaults are chosen so as to minimize the need for using command line switches.

currently we have 134 command line parameters so I think this getting a bit out of hand

Good point, I really have not given this any thought. Maybe we have come to a point where we should think more about the overall organization of the command line switches, e.g., some of them can be grouped rather than being boolean.

for each run I want to have an audit trail

This is already implemented. You will finda list of all values used for command line switches at the top of the PRT or DBG file. This was probably not in place for the 18.04 release though.

For the particular error message at hand, it is one we simply did not get around to improving. It comes across as developer centric, and could be made significantly more useful/friendly for users. My take on how we got there is the following: Firstly, our ambition early on for the parsing is that we would implement any keyword that came our way more or less immediately, which would move the need for user-centric messages onto the simulator. You will see this by the typically user-centric formatting of messages emitted on the simulator side (in LOG and PRT output). However, the parsing code does so much more than parsing. Indeed, it is looking more and more like a full fledged compiler. In this setting, there is actually a need for developer centric error messages (i.e., developers need to see where in the code this error came from and that it was an exception).

By now it seems clear that implementing parsing of all keywords and their options in the file format will not happen any time soon, so the need for more user-centric messages should probably be weighed in more. This effort will naturally need to involve core developers (since it will impact their work flow), which has made it difficult to prioritize, still is. Switching to NOSIM sounds like an interesting approach, so we can see if there is strong developer support for it.

joakim-hove commented 5 years ago

@OPMUSER: If the parser found an error could it not automatically switch to NOSIMmode to avoid assembling the deck - would that minimize the chance of a crash?

That certainly would make sense; but mind you; the problems I am talking about when you start ignoring unknown keywords is in the parsing process itself; the problem is that identifying where one keyword ends and another starts is surprisingly tricky, and when we start ignoring keywords chances that we will go completely astray increase dramatically.

@alfbr: By now it seems clear that implementing parsing of all keywords and their options in the file format will not happen any time soon

No - that is probably right; but observe that parser code has support for adding keywords runtime, that is not exposed/enable from flow - but we could consider doing it. Two possible alternatives:

  1. Add switch --load-keyword=KEYWORD.json - if the user has written json configuration file(s) for unknown keywords they can be fully internalized in the parser machinery - runtime. But of course this requires people to write valid json keywords.

  2. Add switch --ignore-keyword=KEYWORD1

Both of these options are more or less 100% ready in the parser code, but it is question of whether we want to expose that functionality in flow. Personally I think that would be very good!

alfbr commented 5 years ago

Both of these options are more or less 100% ready in the parser code, but it is question of whether we want to expose that functionality in flow.

From my end, that would be a welcome improvement. I don't think we should be afraid of adding more functionality to the command line switches. I am convinced that we can reorganize them to make them less overwhelming. For those who are less lenient to use command line switches, they can simply use the typical approach today (i.e., comment out unsupported keywords in the deck until it runs through).

OPMUSER commented 5 years ago

@joakim-hove Yes parsing an 1980's command input deck is not the easiest of tasks. Even the commercial simulator sometimes goes to "never never land" when it reads unrecognized keywords. As long as you know where the parser got up to before it crashed you can figure out what caused it.

There are two issues here one is simple typos (which we will always have to deal with) and the other is valid keywords not supported by the parser. For the latter we could just go through the commercial simulator's keyword list and "black list" keywords that are not recognized by the parser. I think that will be a worthwhile exercise, as in many cases users are running existing decks from the commercial simulator with OPM Flow. If we do this then we don't have to add --ignore-keyword=KEYWORD1 etc. We could just put the list in a file with the simulator install. Would that work or does the parser need a complete json file for each keyword (not a simple task given the number of keywords) or would just the keyword name be suffice?

For the command line parameters I was thinking, when I get some spare time, to write a Python GUI front end that would generate the command line parameters and enable batching multiple runs. I say Python as I don't know C++. I need to do this for myself first before "sharing"

hnil commented 5 years ago

making flow python command which can handle some of the higher level logic I think is a good idea, if it is acceptable for users... Even all solvers in one C++ executable could be solved if this is acceptable. A related problem is to know if keywords read by the parser actually is used. Default behavoir do at least say anything.

alfbr commented 5 years ago

when I get some spare time, to write a Python GUI front end

Sounds cool! But please be advised that command line switches are not stable in any sense. Right now they are rather developer centric, and developers are typically free to change them at their own will. Hence, a gui may need to be significantly updated every release. More long term we may look at how to run flow from Resinsight, and then cater to configuration options there.