Error "`bdb_bt_compare` was given keys of different length" is actually triggered

ddeclerck commented 5 months ago

Yeah, that's in some code I wrote 😅

This error should not happen, yet it does happen in one of our customer use case. I was provided the program that reads the "offending" indexed file, but I'm still expecting the program that creates it so I can analyze further.

Side note : the error does not trigger termination ; should we add a call to cob_hard_failure after cob_runtime_error ?

ddeclerck commented 5 months ago

Interestingly, the indexed BDB file has been produced by an external tool. Analyzing with db_dump reveals some internal pages containing short data (length 3/4), possibly interpreted as keys by GnuCOBOL (not sure what those are - maybe partial/intermediate keys ? - I don't know about BDB's internals). Feeding this BDB file through GCSORT using a COPY operation generates a new BDB file that does not contain these short datas, but preserving the actual file contents.

Anyways, should GnuCOBOL accept these files ? If so, the fix is rather easy: we should order keys even if they do not have the same length. I tried it and the given program behaves as expected.

ddeclerck commented 5 months ago

(probably should have notified you about this one, @GitMensch )

GitMensch commented 5 months ago

I've subscribed to all PRs and issue here, so get the starting post by mail :-)

ddeclerck commented 5 months ago

I've subscribed to all PRs and issue here, so get the starting post by mail :-)

Ah, alright, so eyes everywhere 👀 :D

GitMensch commented 5 months ago

I see, we speak about https://github.com/OCamlPro/gnucobol/blob/39344cf66085620e1dfdb16d54c7ba081efd1da8/libcob/fileio.c#L914-L936

Side note : the error does not trigger termination ; should we add a call to cob_hard_failure after cob_runtime_error ?

Definitely not within the function (a runtime error may also consist of multiple outputs) but it is commonly used the way that after the first we do call the second function.

Anyways, should GnuCOBOL accept these files ?

Depends. in general the ORGANIZATION INDEXED format has an internal structure, so if it is effort or possibly costly, then we may should either not do that or not by default. As-is we already have the size check there and would within this case do what cob_cmp_alnum() does (which we could extract from there into two functions and the "inner one" (that gets 2 times data pointer 2 sizes and one optional col) be made available via coblocal and use in fileio.c.

ddeclerck commented 5 months ago

Depends. in general the ORGANIZATION INDEXED format has an internal structure, so if it is effort or possibly costly, then we may should either not do that or not by default. As-is we already have the size check there and would within this case do what cob_cmp_alnum() does (which we could extract from there into two functions and the "inner one" (that gets 2 times data pointer 2 sizes and one optional col) be made available via coblocal and use in fileio.c.

I see that cob_cmp_alnum compares any extra character with spaces, but I don't think bdb_bt_compare should to that: it is used only when a collating sequence is specified, otherwise BDB defaults to an internal comparison function, which, according to the documentation, behaves as follows:

If no comparison function is specified, the keys are compared lexically, with shorter keys collating before longer keys.

GitMensch commented 5 months ago

We only use BDB to mimic ISAM and that means that keys must be sorted as-if like in COBOL and there alphanumeric compares behave as-if they would be of same size and space padded. If the file was completely created by libcob then the keys are of identical size so far, so that's no issue, if we add the handling for different size we may also do that if no collation is specified - but that may produce a recognizable overhead in which case we should only do this depending of an additional preprocessor variable...

ddeclerck commented 5 months ago

Then we have two choices:

either only accept correct files (in which case, GCSORT seems to be able to fix the offending files)
or add a specific comparison function that must be explicitly activated in configure (I'd delegate this task to someone else at OCamlPro ;) )

I guess both are fine.

GitMensch commented 5 months ago

1 and 2 can actually go together, as long as there's an option to switch. And as this is minimal and quite special, it is enough to have that directly with a preparser flag:

int bdb_bt_compare ()
{
#ifdef USE_BDB_KEYDIFF /* flag may passed with CPPFLAGS */
  /* code taking also "no col" and effectively doing the cob_cmp_alnum [either moved out or copy-pasted with a comment)] */
#else
  /* current code and additional cob_hard_failure */
#endif
}

Of course for USE_BDB_KEYDIFF (feel free to use a better name) the compare function needs to be always registered, also with an empty col,

ddeclerck commented 5 months ago

Alright, so no need to add a configure option - just need to document that flag somewhere.

I guess @engboris is up to the task ;)

engboris commented 4 months ago

Hello @GitMensch, I will be in charge of this fix. I have a question to make things clearer for me:

Could you explain your take on cob_hard_failure? I see that you include it in your last example ("and additional cob_hard_failure [...]"). In particular, I don't understand the second part of this message:

Definitely not within the function (a runtime error may also consist of multiple outputs) but it is commonly used the way that after the first we do call the second function.

GitMensch commented 4 months ago

The hard failure should be used next to (not within) the runtime error function.

I guess you'll not only add it (the fix part) but also include the variant that operates on not 100% confirming BDB files (always using a sort function), right?

Please target your PR to the 3.x branch.

OCamlPro / gnucobol

Error "`bdb_bt_compare` was given keys of different length" is actually triggered #154