grammarware / software-evolution

Picture Clause representation #6

Open DanielMocanovici opened 6 months ago

DanielMocanovici commented 6 months ago

I am confused about how a picture clause is defined in the data structure, and about what should and should not count as a valid representation. This concerns the following passage from the paper: [image] More specifically, its last sentence. The way I understand it, the representation can be almost anything, and any character that is not one of the picture symbols is taken literally. E.g. "PICTURE IS XXCXX" would represent any 2 characters, followed by the character 'C', followed by any 2 characters.

Another question that came up is whether fields composed of both digits and characters can have leading digits or signs. Are the following 2 representations valid? 1) "S99999V999XXXXXX" and 2) "SXVXX999". In the first example, the S and V become part of the numeric representation; in the second, they are part of the string representation. If so, are they represented as the literal characters 'S' and 'V', or as a sign ('+' / '-') and a decimal separator (',' / '.')?

andreioff commented 6 months ago

I also have a follow-up question about this: if characters and digits are allowed together in a picture clause, say "9X9X", does this mean that a field with such a picture clause would store a string of 4 characters where the 1st one is a digit char (e.g. '1' or '3'), the 2nd one is any character (including digit chars), the 3rd one is a digit char again, etc.? Is it also true that any value stored in a field must obey the picture clause of that field?
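One way to picture the "a stored value must obey the picture clause" reading is as pattern matching. The Python sketch below is only an illustration under stated assumptions, not BabyCobol's reference behaviour: it assumes that 9 means a digit, X any character, A a letter or space, that a count in parentheses repeats the preceding symbol, and that every other character stands for itself. S and V get no special treatment here, so they fall into the literal case. The names `expand`, `picture_to_regex` and `conforms` are hypothetical.

```python
import re

# Assumed symbol classes for this sketch; any other character is a literal.
SYMBOL_CLASSES = {"9": r"\d", "X": r".", "A": r"[A-Za-z ]"}


def expand(picture: str) -> str:
    """Expand repetition counts, e.g. 'X(5)' becomes 'XXXXX'."""
    return re.sub(r"(.)\((\d+)\)",
                  lambda m: m.group(1) * int(m.group(2)),
                  picture)


def picture_to_regex(picture: str) -> re.Pattern:
    """Translate a picture string into a regular expression, one position at a time."""
    parts = [SYMBOL_CLASSES.get(ch, re.escape(ch)) for ch in expand(picture)]
    return re.compile("".join(parts), re.DOTALL)


def conforms(picture: str, value: str) -> bool:
    """True if the whole value matches the picture, position by position."""
    return picture_to_regex(picture).fullmatch(value) is not None


print(conforms("9X9X", "1A3B"))    # digit, any, digit, any
print(conforms("9X9X", "A234"))    # first position is not a digit
print(conforms("XXCXX", "ABCDE"))  # the literal 'C' sits in the middle
```

Under these assumptions "9X9X" accepts "1A3B" and "1234" but rejects "A234", and "XXCXX" accepts only five-character values whose third character is 'C'.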

grammarware commented 6 months ago

In practice, COBOL developers usually do not mix 9s (and Zs, Vs, Ss, etc) with Xs (or As) and thus keep "types" either numeric or non-numeric. For your implementation it would be most logical to choose to pursue one of the two clean paths:

andreioff commented 6 months ago

The paper about BabyCobol mentions the following: "In modern high-level programming languages the closest analogy to a picture clause would be an object field with a getter ... and a setter that parses the provided value according to the same pattern.". Does this mean that 2 strings with different picture clauses but the same number of characters, say X PICTURE IS 123X(5) and Y PICTURE IS 456X(5), are different because their picture clauses are different, or are they considered equal as long as the 5 characters stored in them are the same? In other words, do any extra characters that are not picture symbols influence variable assignments and other operations (such as string concatenation: X + Y)?

grammarware commented 6 months ago

PICTURE clauses are all about data representation. If you define X PICTURE IS 123X(5), you are stating that this field will take up eight bytes, the first three of which are fixed and the other five can be anything depending on the values.

By definition, no valid value of X will ever be equal to any valid value of Y.

EVALUATE X + Y
    WHEN "123HELLO456WORLD"
        DISPLAY "QAPLA'"
END.

(To prepare for this, you also need to include 123 in the preceding MOVE statements. Think of the extra characters as units: if you define your MONEY IS PICTURE $999V99, then you don't want someone just executing the assignment MOVE €20 TO MONEY without raising an error or at least a warning)
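To make the eight-byte reading concrete, here is a small Python sketch of a field whose setter checks the value against the picture, in the spirit of the getter/setter analogy quoted from the paper. It is an illustration under assumptions, not the BabyCobol specification: the `Field` class and its `move` method are hypothetical, only X is treated as a placeholder, and every other character is taken as a fixed literal.

```python
import re


class Field:
    """Hypothetical field whose picture mixes fixed literals with X placeholders."""

    def __init__(self, picture: str):
        # Expand counts such as X(5) into XXXXX.
        expanded = re.sub(r"(.)\((\d+)\)",
                          lambda m: m.group(1) * int(m.group(2)),
                          picture)
        self.picture = picture
        self.size = len(expanded)  # number of characters the field occupies
        # Assumption: X is "any character", everything else is a fixed literal.
        self._pattern = re.compile(
            "".join("." if ch == "X" else re.escape(ch) for ch in expanded),
            re.DOTALL,
        )
        self.value = None

    def move(self, value: str) -> None:
        """Store a value, rejecting anything that does not fit the picture."""
        if self._pattern.fullmatch(value) is None:
            raise ValueError(f"{value!r} does not fit PICTURE {self.picture}")
        self.value = value


x = Field("123X(5)")
y = Field("456X(5)")
x.move("123HELLO")  # the fixed prefix 123 must be part of the moved value
y.move("456WORLD")
print(x.size)             # 8
print(x.value + y.value)  # 123HELLO456WORLD
```

Because every valid value of `x` starts with 123 and every valid value of `y` starts with 456, the two can never compare equal, and `x.move("HELLO")` raises an error instead of silently dropping the prefix. Raising is a design choice of this sketch; a warning would fit the comment above equally well.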

andreioff commented 5 months ago

Just a sanity check: the MOVE example makes sense, but does that mean all numeric variables need to have the same picture to be used together in a statement? E.g. in ADD X TO Y GIVING Z, do X, Y and Z need to have the exact same picture, say $999? If yes, does that also apply to comparisons, e.g. using ==, or is it enough in that case to compare the numeric values and ignore the $ sign?

grammarware commented 5 months ago

The == part of the question is easy since == does not exist in BabyCobol 😈

ADD works with types that are not necessarily equal but perhaps equivalent or at least compatible. For instance, if you leave the dollar sign (and other weirdness) out, then usually all numeric types are compatible with one another, just remember that COBOL and BabyCobol do silent overflows and cutoffs and no rounding. I believe a good definition of this type compatibility would be that the numeric part is still there to do any ADD-ing on, and the non-numeric non-string fixed parts align. Then you would be able to add something of picture clauses $999 and $99.99, for instance.
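The compatibility rule suggested above could be sketched roughly as follows. This is one possible interpretation in Python, not a definition from the BabyCobol paper: it assumes that 9, Z, S, V and the decimal point form the numeric part, that every other character is a fixed part, that pictures are already written without repetition counts, and that "cutoff" means dropping surplus high-order digits and surplus decimals of a non-negative value. The names `fixed_parts`, `compatible` and `store` are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

# Assumption: these symbols make up the numeric part of a picture.
NUMERIC_SYMBOLS = set("9ZSV.")


def fixed_parts(picture: str) -> str:
    """Return the non-numeric literal characters of a picture, in order."""
    return "".join(ch for ch in picture if ch not in NUMERIC_SYMBOLS)


def has_numeric_part(picture: str) -> bool:
    """True if there is at least one digit position to do arithmetic on."""
    return "9" in picture or "Z" in picture


def compatible(a: str, b: str) -> bool:
    """Both pictures have a numeric part and their fixed parts line up."""
    return (has_numeric_part(a) and has_numeric_part(b)
            and fixed_parts(a) == fixed_parts(b))


def store(value: Decimal, int_digits: int, frac_digits: int) -> Decimal:
    """Fit a non-negative value into the given digit counts without rounding."""
    quantum = Decimal(1).scaleb(-frac_digits)             # e.g. 0.01 for two decimals
    cut = value.quantize(quantum, rounding=ROUND_DOWN)    # drop extra decimals
    return cut % (Decimal(10) ** int_digits)              # drop extra high-order digits


print(compatible("$999", "$99.99"))      # True: same fixed part "$"
print(compatible("$999", "€999"))        # False: the units differ
print(store(Decimal("1234.567"), 2, 2))  # 34.56
```

With this reading, $999 and $99.99 can be added because both carry the same "$" around a numeric part, while mixing $ and € is rejected. The `store` helper mirrors the silent overflow and cutoff behaviour: 1234.567 stored into a field with two integer and two decimal digits becomes 34.56, with nothing rounded.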