src-d / lookout-sdk

SDK for lookout analyzers
Apache License 2.0

Feature request: support UAST cache #83

Open vmarkovtsev opened 5 years ago

vmarkovtsev commented 5 years ago

It would be great to be able to load UASTs from disk instead of parsing files every time. This is critical for ML benchmarks and quality evaluations, where we can analyze 1,000 predefined repositories at the same revisions. We would get more stable timings and a more stable experience (some drivers may crash, e.g. cpp, or parse differently, e.g. js), and the runs would also be faster.
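A minimal sketch of the requested behavior, under stated assumptions: UASTs are stored on disk keyed by a hash of the language and file content, and the parser is called only on a cache miss. `DiskUASTCache` and the `parse` callable are hypothetical names for illustration, not part of lookout-sdk, and the sketch assumes the UAST can be serialized as JSON.

```python
import hashlib
import json
import os
import tempfile


class DiskUASTCache:
    """Hypothetical on-disk UAST cache; not part of lookout-sdk."""

    def __init__(self, root, parse):
        self.root = root
        # Assumption: parse(language, content_bytes) returns a
        # JSON-serializable UAST, e.g. a wrapper around a bblfsh client.
        self.parse = parse
        os.makedirs(root, exist_ok=True)

    def _path(self, language, content):
        # Same language and same bytes always map to the same file,
        # so fixed repository revisions hit the cache on every run.
        digest = hashlib.sha256(
            language.encode("utf-8") + b"\0" + content
        ).hexdigest()
        return os.path.join(self.root, digest[:2], digest + ".json")

    def get(self, language, content):
        path = self._path(language, content)
        try:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            pass  # cache miss: fall through to the parser

        uast = self.parse(language, content)
        directory = os.path.dirname(path)
        os.makedirs(directory, exist_ok=True)
        # Write to a temporary file and rename it, so an interrupted
        # run never leaves a truncated cache entry behind.
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(uast, f)
        os.replace(tmp, path)
        return uast
```

With a cache like this, a benchmark over fixed revisions would parse each file once; later runs read the stored result and do not depend on the driver at all.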

carlosms commented 5 years ago

ping @smola @src-d/product.

smacker commented 5 years ago

Are we talking about UAST parsing for the content of pull requests, or about the proxy to bblfshd?

vmarkovtsev commented 5 years ago

I meant the UASTs supplied by the data service built into the lookout-sdk binary.

smacker commented 5 years ago

Thanks! Just a note for Santiago and product: we can implement caching in lookout itself, to avoid parsing the same files for each analyzer, and make the backend configurable. Lookout can then keep UASTs in memory for a short period of time (analyzers are going to request UASTs only after the event), and sdk-bin can use an on-disk cache, which would solve this issue.
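One way to picture the configurable backend is a small `get`/`put` interface with a short-lived in-memory implementation. This is only a sketch: `MemoryBackend`, `CachingParser`, and the `parse` callable are hypothetical names, not lookout code, and a disk-backed class for sdk-bin would only need to offer the same two methods.

```python
import time


class MemoryBackend:
    """Hypothetical in-memory backend: entries expire after ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested
        self.entries = {}   # key -> (expiry time, uast)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        expires, uast = item
        if self.clock() >= expires:
            # Entry is too old: drop it and report a miss.
            del self.entries[key]
            return None
        return uast

    def put(self, key, uast):
        self.entries[key] = (self.clock() + self.ttl, uast)


class CachingParser:
    """Parses a file once and serves later requests from the backend."""

    def __init__(self, backend, parse):
        self.backend = backend
        # Assumption: parse(language, content) returns the UAST.
        self.parse = parse

    def uast(self, key, language, content):
        cached = self.backend.get(key)
        if cached is not None:
            return cached
        result = self.parse(language, content)
        self.backend.put(key, result)
        return result
```

Because several analyzers request the same files shortly after one event, a short TTL is enough to share a single parse between them without holding UASTs in memory indefinitely.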