alanz / ghc-exactprint

GHC version of haskell-src-exts exactPrint
BSD 3-Clause "New" or "Revised" License
70 stars 32 forks source link

deltaposition anchor inconsistency #62

Open lspitzner opened 6 years ago

lspitzner commented 6 years ago

Considering the elements of annsDP, and the anchor of DPs for the contained elements:

Consider this example:

list = [ some laaaaaaaaaaaaarge expresssion -- this comment has DP (0,1)

       , another thing -- comma at start of this line has DP relative
                       -- to the position before "[", i.e. DP (2,7)
       ]

-- might be re-formatted like this:

list = [ some
           laaaaaaaaaaaaarge
           expresssion -- still at DP (0,1)

       , another thing -- now would be DP (4,7)
       ]

That is, node-relative DPs change when using a different layout for some random child node. If at all possible, an encoding that does not change for such a re-format should be used instead. This would make it much easier for brittany to respect/retain such additional newlines.

(And imho the status quo is just inconsistent and surprising, and it requires additional statefulness when processing annotations.)

alanz commented 6 years ago

The model is used because the AST elements follow a definite structure, and must be able to nest, indent, etc.

The comments can appear anywhere, and simply take up what would otherwise be whitespace.

There is a substantial amount of complexity/subtlety to get the thing to actually work, so I am not confident that the requested change can be done easily.

I would welcome a PR to do it, so long as all the tests still pass with it.

lspitzner commented 6 years ago

I don't see how my encoding would not be able to nest, indent, etc. But you certainly have a better picture of the many special cases around this stuff.

But regardless, does the current model not break for refactoring operation such as, lets say, "transform a:[b,c] into [a,b,c]"? Because the DP of the comma between b and c would become unusable? It seems to me that the current design runs into problems sooner or later not only for re-formatting but also refactoring uses.

But I can completely understand that redesigning this would be an enormous amount of work. So I'd like to ask a different question: Would you be opposed to adding redundant information to the exactprint/annotation API, where this additional data is not used for exactprinting but available for downstreams like brittany? The "DP relative to the last non-whitespace character" would be such data that would enable brittany to support retaining empty lines inside literals (tuples, lists, etc.)

Of course that usecase is not terribly important, but I have encountered situations where such feature would be nice-to-have. Either way, thanks for considering.

rwe commented 5 years ago

I wrestled with this today too. All I wanted was to get apply-refact to actually delete unused language pragmas instead of leaving blank lines…but the annotation offsets are inscrutable and unpredictable. For example, given this:

{-# LANGUAGE Foo #-}

{-# LANGUAGE Bar #}

The DP of both of those is (0, 0), for some reason. But if you add an identifier:

{-# LANGUAGE Foo #-}

{-# LANGUAGE Bar #}
foo = 12

Then suddenly the DP of the second pragma has (2, 0). I can't tell if this is a bug in ghc-exactprint, or if it just has surprising semantics. But in either case, I'm not sure how use it to actually transform documents.