Porting F18 tests to use LLVM lit

LukeIreland1 commented 4 years ago

These commits contain:

Scripts that will port error tests
Verified ported error tests
Remaining ported error tests

The remaining code for porting the remaining generic, symbol, and modfile tests is mostly done, but needs adjusting to this format.

LukeIreland1 commented 4 years ago

Merging removed my last commit, just trying to fix this now

sscalpone commented 4 years ago

Why not change the tests to use the standard prefix?

LukeIreland1 commented 4 years ago

Why not change the tests to use the standard prefix?

My aim was to make minimal changes to existing tests, but that's doable, and I don't mind strongly either way.

Keeping the original prefixes may be beneficial to others, and you may be able to identify the test and/or it's objective more easily.

LukeIreland1 commented 4 years ago

Does this handle only tests from the semantics directory? If so, can that be mentioned in the commit message. What about tests in other directories?

Only error tests from the semantics directory, so I was explicit about that part, but not about it also being a semantics test. Given there are no other error tests (as far as I remember), do you think it will be okay to leave it like this? I will be explicit about the other directories, so I think it's unlikely to cause confusion.

CarolineConcatto commented 4 years ago

I think the TODO can still be there, but maybe in another line, not in the same as the ERROR!

LukeIreland1 commented 4 years ago

I think the TODO can still be there, but maybe in another line, not in the same as the ERROR!

Well the TODO invalidates the ERROR line as it is incomplete, and FileCheck will try to look for an invalid line.

CarolineConcatto commented 4 years ago

Another solution is to add the tests that are failing in the list of Expected Failures. But the way it is I believe is not the best.

LukeIreland1 commented 4 years ago

Another solution is to add the tests that are failing in the list of Expected Failures. But the way it is I believe is not the best.

This is what I was doing originally, but Rich said it was unfair on the rest of the test that was working basically, we'll get clarification later in scrum so I'll just wait till then.

sscalpone commented 4 years ago

I think the TODO lines are more about an incomplete compiler checks and less about an incomplete test. But let's not argue semantics. Those TODO lines are probably important reminders of work that needs to be completed. We do not want to lose that information. It's reasonable to ask the original author about their intent.

You might poll about changing ERROR to CHECK. That'll be inline with most lit tests in llvm & leave the TODO lines unchanged.

You might also ask to eliminate the ERROR on the TODO lines. That might meet a positive reception.

LukeIreland1 commented 4 years ago

You might poll about changing ERROR to CHECK. That'll be inline with most lit tests in llvm & leave the TODO lines unchanged.

You might also ask to eliminate the ERROR on the TODO lines. That might meet a positive reception.

We discussed the second options in the scrum just now, but I should have brought up standard prefixes again. Another option was to rename ERROR: to simply error: or ERR: when after a TODO:

LukeIreland1 commented 4 years ago

I'm going to do a poll on the prefix situation, but I will mention it may be easier to leave them as they are, because while tests would work by replacing everything with check, the custom prefixes are useful in displaying their purpose clearly.

👍 for keeping prefixes as they are. 👎 for changing all prefixes to CHECK

sscalpone commented 4 years ago

Do any other llvm projects use custom prefixes?

tskeith commented 4 years ago

The current "ERROR:" prefix means the message must come out as an error. How would that work if it is changed to "CHECK:"?

LukeIreland1 commented 4 years ago

The current "ERROR:" prefix means the message must come out as an error. How would that work if it is changed to "CHECK:"?

So the run line is "not"ed so it will only pass if the code throws an error. The prefix doesn't matter at all, you could call it whatever you want. I have tested before, but just verified again, swapping ERROR with CHECK works fine. But again, for the same reason TODO is useful to developers, so could be ERROR (easily distinguish tests from one another)

! RUN: not %flang -fdebug-resolve-names -fparse-only %s 2>&1 | FileCheck %s

subroutine s1
  implicit none
  !CHECK: More than one IMPLICIT NONE statement
  implicit none(type)
end subroutine

subroutine s2
  implicit none(external)
  !CHECK: More than one IMPLICIT NONE statement
  implicit none
end subroutine

klausler commented 4 years ago

Annoyingly, there's not actually much I can do here. It's attempted to port an incomplete test. I need to check with Rich if I should bother, if there are "TODO"s in there, as I previously intentionally didn't touch them, but he said it should still be able to port the rest of the test. I suppose the answer here is to remove all TODO lines from tests that aren't commented out, but really, if they are incomplete, they should be commented out.

Both of those ideas are bad. It's not the test that incomplete, it's the feature. Once the feature is complete, additional errors will be emitted, and the comment will help update the test once the anticipated error begins to appear.

tskeith commented 4 years ago

So the run line is "not"ed so it will only pass if the code throws an error. The prefix doesn't matter at all, you could call it whatever you want. I have tested before, but just verified again, swapping ERROR with CHECK works fine. But again, for the same reason TODO is useful to developers, so could be ERROR (easily distinguish tests from one another)

"ERROR:" isn't just useful for developers, it means something in the test: the message must come out as an error, not a warning or anything else. If you are proposing to change that behavior you should make that explicit so that it can be discussed.

LukeIreland1 commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

I can't see which file this is referring to, could you help me with that?

LukeIreland1 commented 4 years ago

Annoyingly, there's not actually much I can do here. It's attempted to port an incomplete test. I need to check with Rich if I should bother, if there are "TODO"s in there, as I previously intentionally didn't touch them, but he said it should still be able to port the rest of the test. I suppose the answer here is to remove all TODO lines from tests that aren't commented out, but really, if they are incomplete, they should be commented out.

Both of those ideas are bad. It's not the test that incomplete, it's the feature. Once the feature is complete, additional errors will be emitted, and the comment will help update the test once the anticipated error begins to appear.

What do you suggest? At the moment I'm just changing the case of the prefix in the TODO line so it isn't checked during testing.

LukeIreland1 commented 4 years ago

So the run line is "not"ed so it will only pass if the code throws an error. The prefix doesn't matter at all, you could call it whatever you want. I have tested before, but just verified again, swapping ERROR with CHECK works fine. But again, for the same reason TODO is useful to developers, so could be ERROR (easily distinguish tests from one another)

"ERROR:" isn't just useful for developers, it means something in the test: the message must come out as an error, not a warning or anything else. If you are proposing to change that behavior you should make that explicit so that it can be discussed.

No, I agree, that was in response to Steve's comment about changing/standardising it. I made a poll and hopefully it's explicit enough.

kiranchandramohan commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

I can't see which file this is referring to, could you help me with that?

There is a error line number check in test_errors.sh

sscalpone commented 4 years ago

"ERROR:" isn't just useful for developers

This issue solves itself once flang starts to prefix error messages with error: (along with warning:, note:, etc.).

sscalpone commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

klausler commented 4 years ago

"ERROR:" isn't just useful for developers

This issue solves itself once flang starts to prefix error messages with error: (along with warning:, note:, etc.).

f18 emits error: on error messages.

sscalpone commented 4 years ago

f18 emits error: on error messages.

Cool. I don't know why I didn't know that.

I think this would be ok: !CHECK: error: More than one IMPLICIT NONE statement and !TODO: error: More than one IMPLICIT NONE statement

tskeith commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

I can't see which file this is referring to, could you help me with that?

Edit test-lit/semantics/allocate01.f90 (which currently passes) and remove or move one of the "ERROR:" lines. The test should then fail but it doesn't.

tskeith commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

Right, test_errors.sh could be adapted to work with lit driving it. Probably that's what will have to happen with test_modfile.sh and test_symbols.sh anyway.

kiranchandramohan commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by, 1) Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt 2) Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

klausler commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by,

Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt

Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

The capability that is essential to me is being told the exact source line number associated with a missing error message or with an unexpected error message.

LukeIreland1 commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by,

Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt

Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

The capability that is essential to me is being told the exact source line number associated with a missing error message or with an unexpected error message.

Lit already provides the source line number for unexpected error messages.

sscalpone commented 4 years ago

Lit already provides the source line number for unexpected error messages.

How does lit know that the line is an unexpected error message vs. an uninteresting line of output?

tskeith commented 4 years ago

I suggest that for the purposes of reviewing this you add some tests that fail so that we can see what the output looks like. For example, missing error, extra error, message at wrong line, expected error comes out as non-error.

You would remove these before merging once the PR is approved.

LukeIreland1 commented 4 years ago

Lit already provides the source line number for unexpected error messages.

How does lit know that the line is an unexpected error message vs. an uninteresting line of output?

The compiler gives all the output to lit, specifies which parts are errors, and if it isn't expected via a prefix, it fails and displays this uncaught error.

kiranchandramohan commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by,

Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt

Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

The capability that is essential to me is being told the exact source line number associated with a missing error message or with an unexpected error message.

We can also add check-not in between the error messages to capture any unexpected error messages.

sscalpone commented 4 years ago

exact source line number

No please. That make tests hard to maintain. The FileCheck doesn't require line numbers. Having line numbers or counts does not help catch extra message because, as far as I know, there's no way to tell FileCheck to complain or count an arbitrary line.

check-not in between the error messages to capture any unexpected error messages.

Wouldn't that require a check not for all possible strings on all non-error lines?

Wait, is FileCheck smart enough to parse line numbers out of error messages? If not, how does it match up error messages with the precise line number?

LukeIreland1 commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by,

Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt

Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

The capability that is essential to me is being told the exact source line number associated with a missing error message or with an unexpected error message.

We can also add check-not in between the error messages to capture any unexpected error messages.

I suppose if they're unexpected, then it's hard to know what to put in the check-not lines. My understanding of lit, is there should be no errors thrown in order to pass, and if there is, it has to explicitly be matched correctly to pass. That means adding code that throws a different error causes a fail, and anything but the exact right test will cause a fail, so there's no need to check for what's not there.

LukeIreland1 commented 4 years ago

exact source line number

No please. That make tests hard to maintain. The FileCheck doesn't require line numbers. Having line numbers or counts does not help catch extra message because, as far as I know, there's no way to tell FileCheck to complain or count an arbitrary line.

Annoyingly, I found a way to make it complain about specific lines! I can remove it if it's unnecessary, but that was in order to make @tskeith happy really.

check-not in between the error messages to capture any unexpected error messages.

Wouldn't that require a check not for all possible strings on all non-error lines?

Wait, is FileCheck smart enough to parse line numbers out of error messages? If not, how does it match up error messages with the precise line number?

I'll give you a snippet of my changes.

  !WARNING: [[@LINE+3]]:{{[0-9]+}}:{{.*}} 'X' edit descriptor must have a positive position value
  !WARNING: [[@LINE+2]]:{{[0-9]+}}:{{.*}} Expected ',' or ')' in format expression
  !WARNING: [[@LINE+1]]:{{[0-9]+}}:{{.*}} 'X' edit descriptor must have a positive position value
  write(*,'(XX)')

This is from io10.f90. Yes, it checks the same warning message twice. I didn't write the original tests so no comment on that really.

klausler commented 4 years ago

We can also add check-not in between the error messages to capture any unexpected error messages.

I suppose if they're unexpected, then it's hard to know what to put in the check-not lines. My understanding of lit, is there should be no errors thrown in order to pass, and if there is, it has to explicitly be matched correctly to pass. That means adding code that throws a different error causes a fail, and anything but the exact right test will cause a fail, so there's no need to check for what's not there.

That is correct. These tests must fail if the error messages produced by the compiler do not exactly match the expected errors, and the output should clearly distinguish both the false positives and false negative cases. (Requiring an explicit marker to catch unexpected errors would be error-prone, verbose, and also impossible; how would one specify that a line with two expected errors shouldn't elicit three or more?)

LukeIreland1 commented 4 years ago

I suggest that for the purposes of reviewing this you add some tests that fail so that we can see what the output looks like. For example, missing error, extra error, message at wrong line, expected error comes out as non-error.

You would remove these before merging once the PR is approved.

With these changes I'll add now before I go home, it should cover everything

Missing error: My changes should fix that
Extra error: Some tests actually already have duplicate error/warning checks for the same line of code. The same as the old style test, there's no way to remove them besides manually finding them and removing them. If the old test doesn't add any extra, then it won't fail. If other people add them in future, the test will fail. -Message at wrong line: My changes should 🤞 fix that -Expected error comes out as non-error: My changes look at both the line number and if it's an error test, it has to have "error" in the appropriate output line

klausler commented 4 years ago

!WARNING: [[@LINE+3]]:{{[0-9]+}}:{{.}} 'X' edit descriptor must have a positive position value !WARNING: [[@LINE+2]]:{{[0-9]+}}:{{.}} Expected ',' or ')' in format expression !WARNING: [[@LINE+1]]:{{[0-9]+}}:{{.}} 'X' edit descriptor must have a positive position value write(,'(XX)')
This is from io10.f90. Yes, it checks the same warning message twice. I didn't write the original tests so no comment on that really.

The line in question deserves both warnings, since it violates the same constraint twice.

LukeIreland1 commented 4 years ago

!WARNING: [[@line+3]]:{{[0-9]+}}:{{.}} 'X' edit descriptor must have a positive position value !WARNING: [[@line+2]]:{{[0-9]+}}:{{.}} Expected ',' or ')' in format expression !WARNING: [[@line+1]]:{{[0-9]+}}:{{.}} 'X' edit descriptor must have a positive position value write(,'(XX)')
This is from io10.f90. Yes, it checks the same warning message twice. I didn't write the original tests so no comment on that really.
The line in question deserves both warnings, since it violates the same constraint twice.

My bad, didn't really look at the code, but fair enough. At least that part of my script works!

klausler commented 4 years ago

I suggest that for the purposes of reviewing this you add some tests that fail so that we can see what the output looks like. For example, missing error, extra error, message at wrong line, expected error comes out as non-error. You would remove these before merging once the PR is approved.

With these changes I'll add now before I go home, it should cover everything

Missing error: Missing ERROR: checks will cause the test to fail as is

Extra error: Some tests actually already have duplicate error/warning checks for the same line of code. The same as the old style test, there's no way to remove them besides manually finding them and removing them. If the old test doesn't add any extra, then it won't fail. If other people add them in future, the test will fail. -Message at wrong line: My changes should 🤞 fix that -Expected error comes out as non-error: My changes look at both the line number and if it's an error test, it has to have "error" in the appropriate output line

It is possible for one line of Fortran to legitimately elicit the same message more than once.

LukeIreland1 commented 4 years ago

I suggest that for the purposes of reviewing this you add some tests that fail so that we can see what the output looks like. For example, missing error, extra error, message at wrong line, expected error comes out as non-error. You would remove these before merging once the PR is approved.

With these changes I'll add now before I go home, it should cover everything

Missing error: Missing ERROR: checks will cause the test to fail as is

Extra error: Some tests actually already have duplicate error/warning checks for the same line of code. The same as the old style test, there's no way to remove them besides manually finding them and removing them. If the old test doesn't add any extra, then it won't fail. If other people add them in future, the test will fail. -Message at wrong line: My changes should 🤞 fix that -Expected error comes out as non-error: My changes look at both the line number and if it's an error test, it has to have "error" in the appropriate output line

It is possible for one line of Fortran to legitimately elicit the same message more than once.

Yep, misunderstanding. Extra errors shouldn't be a problem, but I still need to do more testing regarding breaking the test after porting to see if it correctly fails.

LukeIreland1 commented 4 years ago

allocate01.f90 isn't fixed, because lit seems to be funny with check lines containing identical(bar line number) strings consecutively. Will look further into this tomorrow.

tskeith commented 4 years ago

Please don't force push while the PR is being reviewed -- your reviewers can't tell what you have changed.

After it is approved and you are ready to merge you can squash together commits that should be.

kiranchandramohan commented 4 years ago

exact source line number

No please. That make tests hard to maintain. The FileCheck doesn't require line numbers. Having line numbers or counts does not help catch extra message because, as far as I know, there's no way to tell FileCheck to complain or count an arbitrary line.

All the lines with errors have the pattern "::: error" in them. FileCheck can count the lines with that pattern. For e.g. to say that there are only 3 errors you can have the following check. CHECK-COUNT-3: ::: error

check-not in between the error messages to capture any unexpected error messages.

Wouldn't that require a check not for all possible strings on all non-error lines?

I am hoping that a single CHECK-NOT "::: error" in between the expected error messages will catch that. It will probably bail out at the first unexpected error message.

Wait, is FileCheck smart enough to parse line numbers out of error messages? If not, how does it match up error messages with the precise line number?

Yes, FileCheck can parse regular expressions and a line number can be matched with [0-9]+. As shown in @LukeIreland1's example you can also specify expected line numbers without hardcoding it. For e.g with [[@line+1]] you are specifying that the error should be on the next line.

klausler commented 4 years ago

Wouldn't that require a check not for all possible strings on all non-error lines?

I am hoping that a single CHECK-NOT "::: error" in between the expected error messages will catch that. It will probably bail out at the first unexpected error message.

Would this directive have to be inserted into the test source files?

kiranchandramohan commented 4 years ago

If unexpected errors are reported by the compiler or if they are at the wrong line, this does not catch it.

If FileCheck is really unable to catch extra messages or spurious text, that's a fatal flaw. However, flang's existing test-checking tools ought to work with lit, or be easily changed to be compatible with lit. Lit is just a driver.

FileCheck can do this by,

Having a check with count, which will fail if the number of error messages differs. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/check-count.txt

Having a line number along with the check. https://github.com/llvm-mirror/llvm/blob/master/test/FileCheck/line-count-2.txt

The capability that is essential to me is being told the exact source line number associated with a missing error message or with an unexpected error message.

We can also add check-not in between the error messages to capture any unexpected error messages.

I suppose if they're unexpected, then it's hard to know what to put in the check-not lines. My understanding of lit, is there should be no errors thrown in order to pass, and if there is, it has to explicitly be matched correctly to pass. That means adding code that throws a different error causes a fail, and anything but the exact right test will cause a fail, so there's no need to check for what's not there.

@LukeIreland1 I agree with the first part that there should be no errors to pass. But I am not sure about the second part that all errors should be explicitly matched correctly to pass. As far as FileCheck is concerned, it is getting a set of strings as input, it does not know what is an error message and what is not. I was thinking that FileCheck will match all the CHECK's and will be happy with that, if there are extra lines in the input it is not going to complain/error.

Wouldn't that require a check not for all possible strings on all non-error lines?

I am hoping that a single CHECK-NOT "::: error" in between the expected error messages will catch that. It will probably bail out at the first unexpected error message.

Would this directive have to be inserted into the test source files?

Yes. There are also implicit check-nots, see below. --implicit-check-not check-pattern Adds implicit negative checks for the specified patterns between positive checks. The option allows writing stricter tests without stuffing them with CHECK-NOTs.

LukeIreland1 commented 4 years ago

Yes. There are also implicit check-nots, see below. --implicit-check-not check-pattern Adds implicit negative checks for the specified patterns between positive checks. The option allows writing stricter tests without stuffing them with CHECK-NOTs.

That seems to have worked for me. --implicit-check-not error: will mean any error not captured (even duplicates of captured ones, will cause the test to fail. Need to test more, but thanks for highlighting this!

LukeIreland1 commented 4 years ago

Having to force push, because:

 $ git push myfork lit     
To git@github.com:LukeIreland1/f18.git
 ! [rejected]        lit -> lit (non-fast-forward)
error: failed to push some refs to 'git@github.com:LukeIreland1/f18.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Doing as it says seems to result in rebasing my commits on top of themselves (only solution is to completely reset the fork [which I'm not sure how to do without messing up the pull request])

LukeIreland1 commented 4 years ago

Closing to change approach and ease in lit (ideally in time for monorepo merge) rather than adding FileCheck in at the same time. Will also hopefully allow me to push without -f

flang-compiler / f18

Porting F18 tests to use LLVM lit #934