pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.33k stars 638 forks source link

Audit if we can use PyO3's new interning of `PyString` #15007

Open Eric-Arellano opened 2 years ago

Eric-Arellano commented 2 years ago

See https://github.com/PyO3/pyo3/pull/2269 for how it works. We should check if there are places we can use this.

seve-martinez commented 1 year ago

I did a little bit of digging here and found that PyString::intern was also added with this.

A quick search of the code yielded a few spots that this could be used: https://github.com/pantsbuild/pants/blob/020f8daeb54a723f688d4e02506bde8d623f45a3/src/rust/engine/src/externs/fs.rs#L211-L220

However with my limited understanding of PySnapshot i'm not sure if it's methods are good candidates. Are there some specific places you had in mind that could benefit?

I think it can also be used in many of the getattr calls like in the fs.rs module.

e.g.

impl<'source> FromPyObject<'source> for PyPathGlobs {
  fn extract(obj: &'source PyAny) -> PyResult<Self> {
    let globs: Vec<String> = obj.getattr(intern!(obj.py(), "globs"))?.extract()?;  // <-- like so
Eric-Arellano commented 1 year ago

cc @stuhood , any thoughts?

stuhood commented 1 year ago

However with my limited understanding of PySnapshot i'm not sure if it's methods are good candidates. Are there some specific places you had in mind that could benefit?

We already do some rust-side interning internally in PySnapshot (in the fs::Directory type), and that is for directory components (i.e. an/example/file/path would be interned as four separate components). That's probably a good example of where interning can help.

But interning entire file paths as in PySnapshot::files would rarely "hit" the intern cache, so that probably isn't worth it.


One thing to study would be the output of https://www.pantsbuild.org/v2.15/docs/reference-stats#memory_summary, to see if there are any large Python objects which could be made smaller. Here is some recent output: https://gist.github.com/stuhood/949078e32309b94bd14c905719483404

Nothing jumps out at me immediately though =/