Identifier's id field points to a Name

astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.

https://docs.astral.sh/ruff

MIT License

32.23k stars 1.08k forks source link

Identifier's id field points to a Name #13797

Open samwgoldman opened 1 week ago

samwgoldman commented 1 week ago

This issue is just feedback for the AST's API, not a bug.

I've been working with the Ruff parser, which has been really great so far. Thanks a ton for providing this code! While working with the AST, I have found the naming around Identifier and Name nodes a bit awkward. Specifically, Identifier's id field is a Name, so I often end up with code that looks like id.id, because I use id to refer to the Identifier.

This is just a minor thing, but I thought I'd raise it. Would you consider changing the field to name eventually?

Thanks!

MichaReiser commented 1 week ago

Yeah, id.id reads a bit awkward. However, most of the time, it shouldn't be necessary to access the name directly. Instead, you can write &*id or id.as_str to get to the underlying string. You can also compare Identifier with a str etc.

Using name, unfortunately, can also be somewhat awkward:

func_def.name.name and the same forclass_def`.
global_stmt.name[0].name and the same for local_stmt.name[0]
except_handler.name.unwrap().name

We could consider renaming the field to text (TypeScript calls it escaped text), go for name regardless (same as jsshift) or value (swc).

Or we leave it as is and treat it as an internal name that you rarely need to access.

ndmitchell commented 1 week ago

In the code I'm writing (along with Sam), we often need to go from an Identifier to a Name, since we are deliberately keeping the Name intact as a name when passing it around, rather than reducing it to a &str to keep more type safety, and benefit from the underlying CompactStr representation. As a result, Name is being used directly quite a bit.

If you look at the types, Identifier is a pair of Name and Range, but Name is just the name. As a result, naming an Identifier with .name (as you do in StmtFunctionDef) is reasonable (semantically, it's the Name, just wrapped up with a bit more stuff), but naming the Name within an Identifier as id feels wrong (it lacks the Range information that would make it an Identifier). As a result of that, I don't find func_def.name.name to be that awkward - it's not the fact it's id.id that is awkward, but that you don't end up with the pieces of an id at the end.

MichaReiser commented 1 week ago

I don't fully agree with the reasoning about FunctionDef. IMO, func.name should be renamed to func.id to be consistent with renaming Identifier.id to name because the field's type is Name. For historical context, it used to be a str.

Either way. I don't think the name matters that much that it justifies the churn it creates for downstream tools. There are multiple projects using Ruff's parser today and while the rename isn't hard, it still imposes work on them for little gain, especially considering that it's arguable whether the new name is considerably better.

Having said that. I also don't care enough about the name. Feel free to go ahead with the rename if you feel strongly but consider the impact it has on other parser users.