TabulationRecord verification is too strict for duplicate x values

jlconlin commented 3 years ago

The ENDF format allows for duplicate x values in a TAB1 (and other places). There is a verifyXValuesAreSorted function called during the TabulationRecord (i.e., TAB1) constructor. It's a little too strict and doesn't allow adjacent, duplicate x values.

whaeck commented 3 years ago

It actually does allow for duplicate adjacent values. There's the Material test you pointed out that has duplicate points and it passes that test. I also took the additional unit test you want to add for the TabulationRecord and added it to the current code (without your change in the is_sorted_until) and that test also runs properly.

whaeck commented 3 years ago

Now, I'm not saying there isn't an issue.

The following does cause an error:

      " 0.000000+0 0.000000+0         33          0          1          49228 1460  438\n"
      "          4          4                                            9228 1460  439\n"
      " 1.000000+1 1.725000+1 1.500001+1 1.850000+1 15.0000100 1.975000+09228 1460  440\n"
      " 2.700000+1 1.605000+1                                            9228 1460  441\n";

because the first value is evaluated to 1.5000010000000001E+01 and the second one to 1.5000010000000000E+01 by disco, which technically is not sorted.

jlconlin commented 3 years ago

Oh, so this is a disco issue, not an ENDF issue.

whaeck commented 3 years ago

I just tried your fix with the above ENDF snippet, it still fails.

A solution might be to truncate/round the values coming out of disco. That way, the tail gets cut off and the values become the same again. We'll have to look into this in more detail.

jlconlin commented 3 years ago

Does this mean we need to sigfig?

whaeck commented 3 years ago

... runs away screaming ...

whaeck commented 3 years ago

Seriously now: we need to look into this in more detail, and be sure that truncation is the right way to go.

nathangibson14 commented 3 years ago

I don't like the idea of truncation here; I'd rather understand why disco has this behavior than use the "quick fix".

That said, I don't think there is a use case for disco that will use 16 digits of precision, so I can't imagine truncating is a real problem.

nathangibson14 commented 3 years ago

On another note, as I've previously discussed privately with @whaeck, ENDFtk's verification checking is too pedantic in general. It's a design philosophy that would be incredibly difficult to overcome at this stage in development, but something to think about as we move forward.

I appreciate that ENDFtk is pedantic when used to create ENDF data.

I do not appreciate that ENDFtk is pedantic when reading existing data from file/string. I'd much rather get a warning along with a data structure that I can query than an error. Not all ENDF data is perfect, but it may still be useable if there weren't a fatal error.

whaeck commented 3 years ago

@nathangibson14 I definitely do not want to rush a solution to fix this until we fully understand what's going on. For now, it rarely occurs and can be fixed manually if need be - even though this can become cumbersome in certain situations.

As a side note: if not for ENDFtk's pedantic checking, we wouldn't have known about this - and there's probably something that would have gone wrong down the line.

whaeck commented 3 years ago

BTW: I hate floating point comparison

We can fix the comparison to include an epsilon but that doesn't fix the underlying issue. The number will technically still be out of order

nathangibson14 commented 3 years ago

I did some digging in disco and was able to recreate this behavior in a simple C++ script. The meaningful pieces are:

  int base = 1;
  int fraction = 50001;
  int exponent = 1;

  double a =
    ( base*pow(10, 5) + fraction ) *
    std::pow(10., exponent-5 );
  std::cout << a << std::endl;

  double b =
    ( base + fraction * pow(10., -5) ) *
    pow(10., exponent);
  std::cout << b << std::endl;

a uses the operation in disco, which gives the undesired value of 1.5000100000000002e+01. b is an equally rational way to do it and it gives the desired value of 1.5000100000000000e+01.

This is definitely a case where the imprecision of floating point arithmetic is at fault. My change is by no means going to absolve all of that imprecision -- it simply works better for the values in this example. I'll do some more testing to see if one option is better than the other more of the time.

jlconlin commented 3 years ago

So here is an example where this became a problem. I have a perfectly valid PENDF file (attached) for ^123>Te, processed using modern RECONR. See line 191 of the file:

 0.68256000 158.729179 0.68256000 158.729179 6.825600-1 158.7291815234 3  1

When loading this with Python ENDFtk, I get this error:

[error] X-values are not sorted
[info] X-value [232]: 0.68256
[info] X-value [233]: 0.68256
[info] Error while reading TAB1 ordered pairs
[info] Error encountered while parsing TAB1 record
[info] Encountered error while reading section 1 of file 3 of material 5234

I believe the error comes—as @nathangibson14 has pointed out—because of the conversion of the strings 0.68256000 and 6.825600-1 to floating point numbers. The second string has fewer significant digits and disco doesn't do a good job of making sure that all the extra digits in a floating point number are zeroed out.

nathangibson14 commented 3 years ago

I'll have some thoughts on possible changes to disco and whether they could help this issue later today.

But in the meantime, two big questions come to mind--

What is causing RECONR to put three consecutive points at the same energy? Two of them even have the same cross section!
Why is 0.6825600 being written to RECONR? Didn't @whaeck 's updates to disco's write make that default to scientific? The zero at the end means that it doesn't need the extra precision from floating point.

jlconlin commented 3 years ago

I'll have some thoughts on possible changes to disco and whether they could help this issue later today.

But in the meantime, two big questions come to mind--

What is causing RECONR to put three consecutive points at the same energy? Two of them even have the same cross section!

Why is 0.6825600 being written to RECONR? Didn't @whaeck 's updates to disco's write make that default to scientific? The zero at the end means that it doesn't need the extra precision from floating point.

I don't know. It is what has come out of the algorithm. Likely, the differences between the two values are in digits not shown in ENDF. Could we make some changes to avoid this? Probably. Is it worth it? Unclear.
I believe @whaeck's latest change tries to preserve as many significant digits as possible. So, by default, will not use scientific notation.

nathangibson14 commented 3 years ago

I just sent a long email with the conclusion that std::stod doesn't exhibit this problem, but both alternatives (including what's in disco) have issues. I'm a bit concerned about speed in making a switch to stod, (mostly because adding e into strings is a pain) but at least it's a viable option. More work to do. @whaeck has some old code that does this, too.

Regarding item 2, I'm quite sure disco defaults to exponential unless it needs extra digits. I bet the underlying values in that set of three have some rounding that causes the disco write logic to do something a little differently than I'd expect.

njoy / ENDFtk

TabulationRecord verification is too strict for duplicate x values #112