helixbass / tree-sitter-grep

The Unlicense

How to handle unparseable file? #41

Closed (helixbass closed this 12 months ago)

helixbass commented 1 year ago

Currently we're just calling .unwrap() on the result of Parser::parse()

Does it make sense to silently ignore unparseable files that "look like" they should be of the expected file type (e.g. they match *.rs)?

helixbass commented 1 year ago

(Similarly with the preceding .expect() that the file contents were valid UTF-8?)

helixbass commented 12 months ago

OK, having read the docs for tree_sitter::Parser::parse(), I'm not sure there's such a thing as an "unparseable file" to tree-sitter (I think it just does its best and leaves error nodes in the parsed tree?)

And I don't think any of the conditions the docs describe for returning None should ever happen in our current usage, so I think it's valid to keep the .unwrap() as an "invariant assertion"
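To make the two options in this thread concrete, here is a minimal std-only sketch of "invariant assertion" versus "skip the file". It does not call tree-sitter: `try_parse` is a hypothetical stand-in for tree_sitter::Parser::parse(), and `parse_or_panic` / `parse_or_skip` are illustrative names, not functions from this repo. Only the Option-handling pattern is the point.

```rust
/// Hypothetical stand-in for `tree_sitter::Parser::parse()`. It returns
/// `None` only when the parser is misconfigured (modelled here as "no
/// language set"). This is NOT the real tree-sitter API.
fn try_parse(source: &[u8], language_set: bool) -> Option<usize> {
    if language_set {
        // Pretend the "tree" is just the byte length of the input.
        Some(source.len())
    } else {
        None
    }
}

/// Policy A: treat `None` as a violated invariant. Using `.expect()` with
/// a message documents why the `None` case is believed to be impossible.
fn parse_or_panic(source: &[u8], language_set: bool) -> usize {
    try_parse(source, language_set)
        .expect("invariant: parser is fully configured, so parse() returns a tree")
}

/// Policy B: skip the file and report why, instead of panicking.
fn parse_or_skip(path: &str, source: &[u8], language_set: bool) -> Result<usize, String> {
    try_parse(source, language_set)
        .ok_or_else(|| format!("skipping {}: parser returned no tree", path))
}
```

Under this sketch, Policy A keeps the current behavior but with a self-explaining panic message, while Policy B would let a caller log the message and move on to the next file.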

> (Similarly with the preceding .expect() that the file contents were valid UTF-8?)

This was never necessary, since tree_sitter::Parser::parse() accepts raw bytes (an &[u8]) and doesn't require a &str, so I got rid of it

That being said, it might be interesting to see what happens if you try to feed it something that's not valid UTF-8

But that seems like it could be part of the broader question of handling different file encodings
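As a starting point for that experiment, here is a small std-only sketch for detecting invalid UTF-8 in a byte buffer before handing it to a parser. It says nothing about how tree-sitter itself reacts to such input. The function names are hypothetical and not part of this repo.

```rust
/// Returns the byte offset of the first invalid UTF-8 sequence,
/// or `None` if the whole buffer is valid UTF-8.
fn first_invalid_utf8_offset(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).err().map(|e| e.valid_up_to())
}

/// Lossy decode, e.g. for printing matched lines: each invalid
/// sequence is replaced with U+FFFD (the replacement character).
fn display_text(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

For example, `first_invalid_utf8_offset(b"ab\xffcd")` reports offset 2, which could be used either to warn about the file or to decide on a fallback encoding.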