Distinguish warnings from informational messages

tskeith commented 5 years ago

Currently f18 has two kinds of messages, fatal and non-fatal. Non-fatal messages may be warnings (indicating a potential error) or informational (e.g. extra information associated with another message). We should distinguish between those two kinds of non-fatal messages.

Some things that would need to be done to achieve this:

Change bool isFatal_ in message texts to have 3 states
Add another user-defined string literal for warnings
Change message output to put "warning:" in warning messages
Find existing non-fatal messages that should be warnings and change them
Change test_errors.sh to check warnings as well as errors and update expected results
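
To make the first three items concrete, here is a minimal sketch of what a three-state message text with one literal suffix per severity might look like. It is not f18's actual code: the `Severity` enum, the `MessageText` class, the `_warn_en_US` and `_info_en_US` suffixes, and the `Prefix` helper are all hypothetical names chosen for illustration, and the "error: " prefix and the empty prefix for informational messages are assumptions. It assumes C++17.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical three-state replacement for a boolean "is fatal" flag.
enum class Severity { Error, Warning, Info };

// Hypothetical stand-in for a fixed message text; not f18's real class.
class MessageText {
public:
  constexpr MessageText(const char *str, std::size_t n, Severity severity)
      : text_{str, n}, severity_{severity} {}
  constexpr std::string_view text() const { return text_; }
  constexpr Severity severity() const { return severity_; }
  constexpr bool isFatal() const { return severity_ == Severity::Error; }

private:
  std::string_view text_;
  Severity severity_;
};

// One user-defined literal per severity. The suffix names are illustrative.
constexpr MessageText operator""_err_en_US(const char *str, std::size_t n) {
  return MessageText{str, n, Severity::Error};
}
constexpr MessageText operator""_warn_en_US(const char *str, std::size_t n) {
  return MessageText{str, n, Severity::Warning};
}
constexpr MessageText operator""_info_en_US(const char *str, std::size_t n) {
  return MessageText{str, n, Severity::Info};
}

// Text printed before the message body. Only "warning:" comes from the
// proposal above; the other two prefixes are assumptions.
constexpr std::string_view Prefix(Severity severity) {
  switch (severity) {
  case Severity::Error:
    return "error: ";
  case Severity::Warning:
    return "warning: ";
  case Severity::Info:
    return "";
  }
  return "";
}
```

With this shape, a call site would pick the severity by choosing the suffix, for example `"variable is never used"_warn_en_US`, and the output code would consult `Prefix(msg.severity())` rather than a boolean.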

Questions:

Should we have any other kinds of messages? For example, a message attached to another message to provide context might be distinguished from stand-alone messages that provide information about how the program is optimized.
Should we rename the user-defined literals to remove _en_US? Messages are implicitly in that locale so we could shorten the suffixes to _err, _wrn, _inf or something like that.
Do we want to add warning identifiers (issue #44) and should that be done at the same time?

klausler commented 5 years ago

I see this as being a good way to make the -Werror flag not fail compilations with messages that aren't warnings.
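
As a rough sketch of that -Werror behavior, assuming messages carry a three-state severity: the `Severity` enum, the `Diagnostic` struct, and the `CompilationFails` function below are hypothetical and do not reflect how f18 actually implements the flag.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical three-state severity, as proposed in this issue.
enum class Severity { Error, Warning, Info };

// Hypothetical minimal diagnostic record.
struct Diagnostic {
  Severity severity;
  const char *text;
};

// Errors always fail the compilation. Warnings fail it only when
// warningsAreErrors is set. Informational messages never do.
bool CompilationFails(
    const std::vector<Diagnostic> &diagnostics, bool warningsAreErrors) {
  return std::any_of(diagnostics.begin(), diagnostics.end(),
      [warningsAreErrors](const Diagnostic &d) {
        return d.severity == Severity::Error ||
            (warningsAreErrors && d.severity == Severity::Warning);
      });
}
```

With only two states, an informational note would be indistinguishable from a warning here and would fail the build under -Werror.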

I'd like to keep _en_US to retain the option of having non-English messages in the base sources.

Language extensions are already associated with per-feature flags.

To avoid cluttering source code with message numbers, pointers to URLs and Fortran standard sections, &c., we should have a little database of distinct messages that can include their translations and links to further information.

tskeith commented 5 years ago

Regarding _en_US: why would we want to allow non-English messages in the source? That would mean when the compiler is built normally it would produce messages in more than one language. To get consistent messages it would have to go through a localization process.

klausler commented 5 years ago

If something like Kanji mode were implemented by actual speakers of Japanese, maybe they'd want to be able to emit idiomatic error messages in Japanese right in the source. That's the use case I was thinking about.

jeanPerier commented 5 years ago

> we should have a little database of distinct messages

Do you mean something like the flang error message tables errtxt and kanjitxt? Would you have all messages in such a database, or only the core ones (not sure how "core" would be defined)?

jeanPerier commented 5 years ago

> Should we have any other kinds of messages? For example, a message attached to another message to provide context might be distinguished from stand-alone messages that provide information about how the program is optimized.

I would imagine that some people do not care about notes regarding optimizations and would be disturbed by them at first, while notes regarding warnings and errors are welcome most of the time (except perhaps when they lead to too much cascading). So I would say it is useful to distinguish them. But do we really need a different state? Can't we just use the fact that notes related to something else are passed with Attach?
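
One way to read this suggestion, sketched under assumptions: the `Message` struct, its `Attach` member, and the `Render` function below are hypothetical simplifications, not f18's actual message classes. The idea shown is that a context note needs no severity state of its own because it lives inside its parent, while a stand-alone informational message sits at the top level and can be filtered by an option.

```cpp
#include <string>
#include <utility>
#include <vector>

enum class Severity { Error, Warning, Info };

// Hypothetical message type: context notes are stored on their parent.
struct Message {
  Severity severity;
  std::string text;
  std::vector<Message> attachments;

  Message &Attach(Message note) {
    attachments.push_back(std::move(note));
    return *this;
  }
};

// Attached notes always follow their parent. Stand-alone informational
// messages are emitted only on request. Only one level of attachment is
// rendered, to keep the sketch short.
std::vector<std::string> Render(
    const std::vector<Message> &messages, bool showStandaloneInfo) {
  std::vector<std::string> lines;
  for (const Message &message : messages) {
    if (message.severity == Severity::Info && !showStandaloneInfo) {
      continue;
    }
    lines.push_back(message.text);
    for (const Message &note : message.attachments) {
      lines.push_back("  " + note.text);
    }
  }
  return lines;
}
```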

sscalpone commented 5 years ago

Fortran programmers expect to get information about what optimizations and transformations were performed and why others were not performed. For PGI, such information is controlled with the command-line options -Minfo and -Mneginfo.

psteinfeld commented 5 years ago

With respect to translating messages ...

In other projects that I've worked on, here's how we handled internationalization and localization. We separated the messages from the source code in such a way that it was possible for someone to take the product and build a foreign language version with minimal development/build/test effort. Ideally (and typically), the process of building a foreign language version of a product did not require compilation or linking. The goals of this process were to make the creation of a foreign language version of a product be low cost and high quality.

The creation of the foreign language version would proceed as follows. The starting point was that the product would contain a set of files that would contain the English versions of all of the messages. Some other organization or company would decide to pay for the translation. They would take the English versions of the message files and ship them to a translation center, which might be run by yet another company. The company paying for the translation would receive back the translated files and build and test the translated version of the product.

The effort of this build and test process should be minimal. Ideally, the build process would consist only of adding some files to the shipping product. The organization would then test the result to make sure that the translated messages appear, that they're formatted well, and that they make sense in the context in which they appear. For this latter step, ideally, the translating company would have access to a set of tests that cause the messages to appear in context.

Assuming that we follow something close to this process, this argues for keeping all of our messages in English and for keeping them in separate files. We should also make sure that we have tests that cause every message to appear in the compiler's output.

klausler commented 5 years ago

> Assuming that we follow something close to this process, this argues for keeping all of our messages in English and for keeping them in separate files.

What do you mean by "separate files"?

psteinfeld commented 5 years ago

> > Assuming that we follow something close to this process, this argues for keeping all of our messages in English and for keeping them in separate files.
>
> What do you mean by "separate files"?

I mean files whose content is almost exclusively the messages themselves. You want to easily extract all of the content that requires translation and then present this translatable content to a translation center. These translation centers typically have their own tools and data to perform translation at minimal cost.

klausler commented 5 years ago

Does that mean that the C++ code that is constructing a message is no longer doing so with the actual text of the message, but is instead using some kind of indirection through a magic message number/identifier?

psteinfeld commented 5 years ago

> Does that mean that the C++ code that is constructing a message is no longer doing so with the actual text of the message, but is instead using some kind of indirection through a magic message number/identifier?

I'm not so familiar with existing C++ practice. Almost all of my previous experience was with Java, where we kept the messages in property files. Each line of the property file contained a name/value pair where the name was a symbolic name that appeared in the Java source, and the value was a string that represented the message. The Java source code that produced the message would reference the property name. The associated string would then be extracted from the property file to produce the user visible message.

tskeith commented 5 years ago

The requirement we are currently meeting is that translatable strings are identified in the source. So it would not be hard to extract them, have them translated, and put the translations back into the source.

I don't think we should be doing anything more than that now. In any event, that's not part of this issue -- it should be a new one.

klausler commented 5 years ago

Or the translations could be put elsewhere. The original strings in the source would still serve as keys.
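
A minimal sketch of that idea, assuming a simple in-memory table: the `Catalog` alias and the `Translate` function are hypothetical, and how such a table would be loaded (a separate file, a generated header, or something else) is left open.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical translation table: the English text that appears in the
// source is the key, and the translated text is the value.
using Catalog = std::unordered_map<std::string, std::string>;

// Look up the translation of a message. If the catalog has no entry, fall
// back to the original English text so that output is never lost.
std::string Translate(const Catalog &catalog, const std::string &original) {
  auto it = catalog.find(original);
  return it == catalog.end() ? original : it->second;
}
```

Because the source strings remain the keys, the C++ code keeps constructing messages from readable text, and an untranslated message simply appears in English.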

flang-compiler / f18

Distinguish warnings from informational messages #476