utk-se / WorldSyntaxTree

Language-agnostic parsing of World of Code repositories

Perform a single parse for a unique blob #17

Open robobenklein opened 3 years ago

robobenklein commented 3 years ago

I realize now this might not be a good idea since parsing the same content using different languages can result in different trees.

What we CAN do instead is dedup on the pair (FILENAME_HASH, FILECONTENT_HASH), since that pair guarantees the same parser runs on the same content.
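
A rough sketch of what that dedup could look like, just to illustrate the idea. Everything here is hypothetical and not taken from the WorldSyntaxTree code: the names `dedup_key` and `parse_once`, the use of SHA-1 for both hashes, and the `parse(filename, content)` callback are all assumptions.

```python
import hashlib
from typing import Callable, Dict, Iterable, Tuple


def sha1_hex(data: bytes) -> str:
    # SHA-1 is an assumption; any stable hash would work for the sketch.
    return hashlib.sha1(data).hexdigest()


def dedup_key(filename: str, content: bytes) -> Tuple[str, str]:
    # Hypothetical key: (FILENAME_HASH, FILECONTENT_HASH).
    return (sha1_hex(filename.encode("utf-8")), sha1_hex(content))


def parse_once(
    files: Iterable[Tuple[str, bytes]],
    parse: Callable[[str, bytes], object],
) -> Dict[Tuple[str, str], object]:
    # Call parse() at most once per unique (filename, content) pair.
    trees: Dict[Tuple[str, str], object] = {}
    for filename, content in files:
        key = dedup_key(filename, content)
        if key not in trees:
            trees[key] = parse(filename, content)
    return trees
```

With this key, the same bytes stored as `a.py` and `a.c` are still parsed separately, while a repeated `a.py` with identical content is parsed only once.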