Ada-Rapporteur-Group / User-Community-Input

Ada User Community Input Working Group - Github Mirror Prototype

Ada.Strings.Text_Buffers.New_Line_Count and the new-line characters should be annotated as implementation-defined #74

Open nholsti opened 9 months ago

nholsti commented 9 months ago

The definition of Ada.Strings.Text_Buffers in A.4.12 seems to lack two "implementation defined" statements/annotations:

The subtype of New_Line_Count allows the value zero, but that value would not seem to make sense. A statement that New_Line_Count must be a positive number would clarify this.

Moreover, it is not clear if the effect of outputting the characters that "represent a new line" is standardized or is implementation defined. Is it equivalent to calling Ada.Text_IO.New_Line on the output file?

ARG-Editor commented 9 months ago

A few unorganized thoughts on this one:

New_Line_Count was originally added to make the postconditions describing the Character_Count complete. All of those things have been removed, so it's unclear to me that there is any remaining value to this constant. (There no longer is a way to predict the number of characters that will be returned by a specific Get routine.) It's probably a case of incomplete removal. But there's probably no important reason to remove it now.

It certainly is enough for the value of an object to be described as implementation-defined. See 13.7, for instance; there is no English statement that these constants are implementation-defined, the declarations are enough. The same is true here.

Since we never describe the "characters that represent a new line", that's unspecified. Perhaps we should have said "implementation-defined" instead (since the latter requires documentation), but the difference is minute.

Ada.Text_IO.New_Line does not necessarily have to be represented by characters; the definition allows doing it in other ways. So it seems impossible in general to say that the characters are those used in Text_IO. Nor do I see any sane way to say that calling some Text_IO.Put on these characters necessarily acts like New_Line -- the definition of Text_IO explicitly says that the nature of terminators isn't defined by the language (see A.10(8/5)), and, in particular, that they need not be a sequence of characters.

So I don't see any way (in general) to state that property, even though it seems valuable to have some connection and practically the character sequences would be the same here and in Text_IO. I suppose we could have some Implementation Advice to the effect that if a Text_IO line terminator is represented by a character sequence, the same sequence should be used here. But I'm not sure what an implementation that wanted to be totally portable should do: there is no way in this package to find out the characters used so one could properly output New_Lines when needed. (I think an early version of this package had a string constant for that purpose.) Well, I suppose an implementation could write a New_Line to an empty buffer and then retrieve it to do the comparisons -- a pretty messy way to handle this issue.
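The "messy" workaround mentioned above could be sketched roughly as follows. This is only an illustration, assuming an Ada 2022 compiler that provides Ada.Strings.Text_Buffers.Unbounded (for example GNAT with -gnat2022); the procedure name Probe_New_Line is hypothetical, and nothing in the Reference Manual promises that the retrieved length matches New_Line_Count.

```ada
with Ada.Strings.Text_Buffers.Unbounded;
with Ada.Text_IO;

--  Hypothetical probe: writes one New_Line to an empty buffer and
--  retrieves it to see which characters the implementation uses.
procedure Probe_New_Line is
   Probe : Ada.Strings.Text_Buffers.Unbounded.Buffer_Type;
begin
   Probe.New_Line;

   declare
      --  Get returns the whole buffer contents, here only the terminator.
      Terminator : constant String := Probe.Get;
   begin
      Ada.Text_IO.Put_Line
        ("New_Line_Count   ="
         & Ada.Strings.Text_Buffers.Text_Buffer_Count'Image
             (Ada.Strings.Text_Buffers.New_Line_Count));
      Ada.Text_IO.Put_Line
        ("Retrieved length =" & Integer'Image (Terminator'Length));

      --  Print the character codes, since the characters themselves
      --  are usually not printable.
      Ada.Text_IO.Put ("Character codes  :");
      for C of Terminator loop
         Ada.Text_IO.Put (Integer'Image (Character'Pos (C)));
      end loop;
      Ada.Text_IO.New_Line;
   end;
end Probe_New_Line;
```

The output depends on the implementation, which is exactly the portability problem under discussion.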

What do others think -- Is it worth having an AI to add some Implementation Advice in this area?

jprosen commented 9 months ago

On 23/12/2023 at 08:55, ARG-Editor wrote:

What do others think -- Is it worth having an AI to add some Implementation Advice in this area?

IIUC, a text buffer can hold several "lines". Since it contains only characters, there must be some sequence of characters used to separate lines. This sequence of characters is never returned to the user. I see no benefit in imposing, or even recommending, anything about that sequence, and I see no reason to connect it to the way IOs represent a new_line, in the case where it is a sequence of characters. This is a pure, invisible, implementation issue.

--
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
https://www.adalog.fr https://www.adacontrol.fr

ARG-Editor commented 9 months ago

Jean-Pierre Rosen writes:

On 23/12/2023 at 08:55, ARG-Editor wrote:

What do others think -- Is it worth having an AI to add some Implementation Advice in this area?

IIUC, a text buffer can hold several "lines". Since it contains only characters, there must be some sequence of characters used to separate lines.

Yes, of course.

This sequence of characters is never returned to the user.

That's not correct. All of the lines in the buffer are returned via a call to Get, and of course indirectly via calls to Image and related attributes. There's no way to request only one line from the buffer (that was one of the capabilities that was dropped when this package was simplified). So how the lines are separated is relevant to the user. They might want to know in order to have totally portable code to output multi-line Images.

I see no benefit in imposing, or even recommending, anything about that sequence, and I see no reason to connect it to the way IOs represent a new_line, in the case where it is a sequence of characters. This is a pure, invisible, implementation issue.

This conclusion was based on an incorrect understanding, so it isn't helpful. I don't think it can be connected to Text_IO.New_Line as that does not have to be characters and this does (as J-P noted). We could have Implementation Advice or an AARM note. Probably the best thing would be to include a string in the package spec with the characters used. (I had tried to avoid that with line-at-a-time Gets, but those can't usefully be used thru the 'Image interface, thus they were dropped.) Then one could write a loop to search for them and output the string with multiple Put_Lines (if you are a stickler for portability). I suppose one could also have a flag or function that indicated whether directly using Text_IO on the entire string was OK (it would be True in almost all implementations).
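A rough sketch of that search-and-Put_Line loop, for illustration only: it assumes an Ada 2022 compiler that provides Ada.Strings.Text_Buffers.Unbounded, and, because the package spec currently exports no terminator string, it obtains one by probing an empty buffer. The names Put_Multi_Line, Buffer_Terminator and Put_Lines are hypothetical.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Text_Buffers.Unbounded;
with Ada.Text_IO;

procedure Put_Multi_Line is

   --  Hypothetical helper: recover the terminator by writing a single
   --  New_Line to an empty buffer (no such constant exists in the spec).
   function Buffer_Terminator return String is
      Probe : Ada.Strings.Text_Buffers.Unbounded.Buffer_Type;
   begin
      Probe.New_Line;
      return Probe.Get;
   end Buffer_Terminator;

   --  Output Text with Text_IO, turning each occurrence of Terminator
   --  into a Put_Line rather than writing the raw characters.
   procedure Put_Lines (Text, Terminator : String) is
      First : Integer := Text'First;
      Cut   : Natural;
   begin
      if Text'Length = 0 then
         return;
      elsif Terminator'Length = 0 then
         --  Nothing to search for; Index would raise Pattern_Error.
         Ada.Text_IO.Put_Line (Text);
         return;
      end if;

      while First <= Text'Last loop
         Cut := Ada.Strings.Fixed.Index
                  (Text (First .. Text'Last), Terminator);
         exit when Cut = 0;
         Ada.Text_IO.Put_Line (Text (First .. Cut - 1));
         First := Cut + Terminator'Length;
      end loop;

      --  Remaining text that is not followed by a terminator.
      if First <= Text'Last then
         Ada.Text_IO.Put (Text (First .. Text'Last));
      end if;
   end Put_Lines;

   Buffer : Ada.Strings.Text_Buffers.Unbounded.Buffer_Type;

begin
   Buffer.Put ("first line");
   Buffer.New_Line;
   Buffer.Put ("second line");
   Buffer.New_Line;

   declare
      Contents   : constant String := Buffer.Get;
      Terminator : constant String := Buffer_Terminator;
   begin
      Put_Lines (Contents, Terminator);
   end;
end Put_Multi_Line;
```

With a terminator string declared in the package spec, the probing helper would be unnecessary and only the loop would remain.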

                     Randy.