github / semantic

Parsing, analyzing, and comparing source code across many languages
8.94k stars 453 forks source link

Remove uses of pathtype library. #677

Open patrickt opened 2 years ago

patrickt commented 2 years ago

pathtype was not a great success for us:

Furthermore, pathtype doesn't solve the most fundamental problem with the FilePath type currently in base: its String representation. The only valid representation for cross-platform file paths is ByteString, because Windows paths can contain unpaired UTF-16 surrogates. Upcoming revisions of the library are switching it to a ShortByteString representation, which is the Right Thing.

I think the lesson learned here is that "parse, don't validate" is not super practical when the entire world has built around validation of file paths rather than parsing them. Indeed, true validation of file paths would entail IO everywhere, as we'd want to check for the existence and validity of a file path.

robrix commented 2 years ago

1094

< {"files":[{"path":"semantic/test/fixtures/ruby/corpus/method-declaration.A.rb","language":"Ruby","symbols":[{"symbol":"foo","kind":"Method","line":"def foo","span":{"start":{"line":1,"column":5},"end":{"line":1,"column":8}},"nodeType":"DEFINITION","syntaxType":"METHOD","utf16CodeUnitSpan":{"start":{"column":4},"end":{"column":7}},"byteRange":{"start":4,"end":7}}]}]}

vs.

1096

> {"files":[{"path":"semantic/test/fixtures/ruby/corpus/and-or.A.rb","language":"Ruby","symbols":[{"symbol":"foo","kind":"Call","line":"foo and bar","span":{"start":{"line":1,"column":1},"end":{"line":1,"column":4}},"nodeType":"REFERENCE","syntaxType":"CALL","utf16CodeUnitSpan":{"start":{},"end":{"column":3}},"byteRange":{"end":3}},{"symbol":"bar","kind":"Call","line":"foo and bar","span":{"start":{"line":1,"column":9},"end":{"line":1,"column":12}},"nodeType":"REFERENCE","syntaxType":"CALL","utf16CodeUnitSpan":{"start":{"column":8},"end":{"column":11}},"byteRange":{"start":8,"end":11}}]}]}

welp

@patrickt: I have no explanation for this. I don't think I broke it with my merge conflict fix, but I can't rule it out. Thoughts?

(note that while these use the ruby fixtures, they are actually semantic's tests, and not semantic-ruby's)

robrix commented 2 years ago

Running the pre-merge commit locally with 8.10.7, I don't see that error, but I do see 31 errors due to uncaught NoSuchFile exceptions.

robrix commented 2 years ago

Same 31 failures with the master-merged commit with 9.2.1.

patrickt commented 2 years ago

@robrix Yeah I have absolutely no idea what this could be. Do we need to regenerate the fixtures or something?

patrickt commented 2 years ago

And why is it comparing one method-declaration test and one and-or test? That doesn’t seem right. Is there some ordering nonsense being thrown in here?

robrix commented 2 years ago

Let's see how far we get fixing the NoSuchFile exceptions; maybe that error was spurious?