yakra / tmtools

Tools to aid in development of the TravelMapping project

DBFtrim Info Display: MaxIntD side effects #23

Closed yakra closed 6 years ago

yakra commented 6 years ago

Sometimes the info display shows a <- in the Data column for type N/F fields when extra 0s were not trimmed. There does not seem to be any effect on actual file output.

• Only observed in ROADS_ACF.yOrig.dbf

ah_blm      F   31  12  0.86811049 <- 0.86811049
ah_elm      F   31  12  0.54514911 <- 0.54514911
ah_length   F   31  11  0.03254181 <- 0.03254181
ah_seg_num  F   31  3   115 <- 115.000000000000000

All of the first 3 fields had "0000000" trimmed

yakra commented 6 years ago

added @ end of field info display:

//BUG TEST
if (!strcmp(tDBF.fArr[i].name, "ah_blm")) // repeat for each field of interest
{   cout << "DecCount = " << (int)oDBF.fArr[i].DecCount << '\n';
    cout << "MinEx0 = " << (int)oDBF.fArr[i].MinEx0 << '\n';
    cout << "MaxIntD = " << (int)oDBF.fArr[i].MaxIntD << '\n';
    return 0;
}//*/

ah_blm, ah_elm: DecCount = 15, MinEx0 = 7, MaxIntD = 3

ah_length: DecCount = 15, MinEx0 = 7, MaxIntD = 2

ah_seg_num: DecCount = 15, MinEx0 = 16 (includes the decimal point itself), MaxIntD = 3

For the first 3 fields, strlen(MaxVal) is MaxIntD-1 less than len. It appears len is being calculated correctly, but MaxVal is not updated. For the info display, the '0' or '.' is added after MaxVal's null terminator, rather than replacing it.
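To make the "after the null terminator" effect concrete, here is a minimal sketch of the general C-string behavior. It is not the DBFtrim code: `viewAfterWrite` and its 32-byte buffer are hypothetical, and the indices 12 (len) and 10 (strlen) are taken from the ah_blm row above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of DBFtrim: copy s into a zero-filled
// 32-byte buffer, write ch at index pos, and return what a C-string
// reader (cout, strlen, ...) would see afterwards.
// Assumes s.size() < 32 and pos < 31.
std::string viewAfterWrite(const std::string& s, std::size_t pos, char ch)
{
    std::vector<char> buf(32, '\0');
    std::copy(s.begin(), s.end(), buf.begin());
    buf[pos] = ch;                   // the write under test
    return std::string(buf.data());  // reading stops at the first '\0'
}
```

Writing at index 12 leaves the terminator at index 10 in place, so `viewAfterWrite("0.86811049", 12, '0')` still reads as the 10-character string. Writing at index 10 replaces the terminator, and the character becomes visible.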

• ah_blm: all values >= 10 only go to 3 decimal places
• ah_elm: all values >= 10 only go to 3 decimal places
• ah_length: only value >= 10 (10.103) goes to 3 decimal places

One Value? DUDE. This is a perfect test case.

yakra commented 6 years ago

10.103 @ offset 236687, record # foo
0.03254181 @ offset 295632, record # bar
(offsets are from a file culled to ah_length only)

When 10.103 is encountered, there's no way to know a longer string will be encountered later. Requiring a new MaxVal to have a greater IntD would break type C fields, which always have an IntD of 0.

Allowing Info Display to a "full-width" value would require:
• separating out type C field comparison again
• separating out length processing from MaxVal storage for N/F fields
• reworking how the null terminator is stored.

In total, this is more hassle than I feel like bothering with (especially if I'm looking to eliminate the switch statement in field.cpp). I'm fine with seeing a MaxVal in the Info Display that doesn't reflect the full MaxIntD, and needing to see that the value in the Max column is greater than strlen(MaxVal).

A possible workaround is to store a mvIntD in reserved bytes, and prepend MaxIntD-mvIntD characters ('#' or something) to MaxVal in the InfoDisplay. Yes. I like this...

But first, the issue of replacing the actual null terminator in situ must be addressed. A simple strlen should do the trick.

yakra commented 6 years ago

When 10.103 is encountered, there's no way to know a longer string will be encountered later.

Ergo,

Allowing Info Display to a "full-width" value would require: ... • reworking how the null terminator is stored.

...Because we've got to wait till later, when MinEx0 is known. Inserting a new null terminator MinEx0 places from the end of the string is problematic: how can I be sure it leaves the proper number of decimal places? A file can conceivably contain strings with different counts of decimal places, e.g. 1.5, 1.25, 1.125, 1.0625, etc. I hadn't seen this out in the wild, but something told me it's wise to allow for this possibility.

A possible workaround is to store a mvIntD in reserved bytes, and prepend MaxIntD-mvIntD characters ('#' or something) to MaxVal in the InfoDisplay.

Nope. I found a case of "different decimal counts" in txdot-2015-roadways_48113.dbf. I.e., MaxVal for Shape_Leng is 0.0020730099, which takes up all 12 bytes of the original field size. Shape_Leng also contains a value of 10.647177, which also fits within those 12 bytes, due to having fewer decimal digits. Thus, MaxIntD-mvIntD == 1, and one '#' is prepended. "#0.0020730099" implies a 13-byte field size, not the 12 bytes allocated in the original file.
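The arithmetic of that counter-example can be checked with a small sketch. The helper names `intDigits` and `prependByIntD` are hypothetical and only model the rejected idea; they are not taken from DBFtrim.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of characters before the decimal point
// (the whole string if there is no decimal point).
std::size_t intDigits(const std::string& val)
{
    std::size_t dot = val.find('.');
    return dot == std::string::npos ? val.size() : dot;
}

// Model of the rejected idea: prepend MaxIntD - mvIntD '#' characters,
// where mvIntD is the integer-digit count of MaxVal itself.
std::string prependByIntD(const std::string& maxVal, std::size_t maxIntD)
{
    std::size_t mvIntD = intDigits(maxVal);
    std::size_t pad = maxIntD > mvIntD ? maxIntD - mvIntD : 0;
    return std::string(pad, '#') + maxVal;
}
```

With MaxVal "0.0020730099" (mvIntD 1) and MaxIntD 2 (from 10.647177), `prependByIntD` returns a 13-character string, one more than the 12-byte field.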

So the actual fix is even simpler: Don't worry about mvIntD at all, and just add len-strlen(MaxVal) characters. Only dbftrim.cpp is affected.
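A minimal sketch of that padding rule, under some assumptions: `padToLen` is a hypothetical name, the pad character '#' and the choice to prepend rather than append are carried over from the earlier workaround idea, and the real change in dbftrim.cpp may look different.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pad MaxVal with '#' so the displayed string is
// exactly len characters wide. No padding is added if MaxVal already
// fills (or exceeds) len.
std::string padToLen(const std::string& maxVal, std::size_t len)
{
    std::size_t pad = len > maxVal.size() ? len - maxVal.size() : 0;
    return std::string(pad, '#') + maxVal;
}
```

For the fields in this issue, `padToLen("0.86811049", 12)` gives "##0.86811049" and `padToLen("0.03254181", 11)` gives "#0.03254181", while the Shape_Leng value "0.0020730099" with len 12 comes back unchanged.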

yakra commented 6 years ago

e52ccdcdb185e963e0a5aceea51b88648b6d95d2 closes this.