tafia / quick-protobuf

A rust implementation of protobuf parser
MIT License
446 stars 82 forks source link

Fix issues with nested items / name resolution #224

Closed snproj closed 1 year ago

snproj commented 1 year ago

Problems

This PR aims to:

However, the causes of these 3 issues (item resolution system) are relatively intertwined, which is why this is 1 PR.

Changes

Giving package names to messages (For #160)

Click to expand
Originally, each top-level message was given an empty package name, regardless of whether the parser detected a package name (e.g. `package some.package;`) for a file. This was changed to add package names to messages, so that package names could be used in name resolution (**#160**). This should still allow for blank package names if no `package` statement was written in the `proto` file, or the `--single-mod` option was used. ### Diff
Show diff
Under `pb-rs/src/types.rs > impl FileDescriptor > fetch_imports()`: ```diff for m in &mut self.messages { - m.set_package("", &self.module); + m.set_package(&self.package, &self.module); } for m in &mut self.enums { - m.set_package("", &self.module); + m.set_package(&self.package, &self.module); } ```

Namespace search (For #133, #160)

Click to expand
For each message or enum, the system at present creates a vector `v` of possible names that the message or enum could possibly take (each corresponding to a larger and larger namespace) to search through. Due to the implementation, some names were missed. This PR fixes **#133** by creating and pushing the rest of the possible names onto the stack in a for-loop. Also, to help fix **#160**, this is done in order (see note below!), so that the system searches progressively larger and larger namespaces, in order to comply with protobuf spec. > IMPORTANT NOTE: Name resolution order followed was based on [this](https://github.com/protocolbuffers/protobuf/issues/6296). ### Diff
Show diff
Under `pb-rs/src/types.rs > impl FileDescriptor > resolve_types() > rec_resolve_types()`: ```diff let test_names: Vec = if name.starts_with('.') { vec![name.clone().split_off(1)] } else if m.package.is_empty() { - vec![name.clone(), format!("{}.{}", m.name, name)] + vec![format!("{}.{}", m.name, name), name.clone()] } else { - vec![ - name.clone(), - format!("{}.{}", m.package, name), + let mut v = vec![ format!("{}.{}.{}", m.package, m.name, name), - ] + format!("{}.{}", m.package, name), + ]; + for (index, _) in m.package.match_indices('.').rev() { + v.push(format!("{}.{}", &m.package[..index], name)); + } + v.push(name.clone()); + v }; ```

Prevent overwriting in name resolution map (For #160, #170)

Click to expand
The system uses hashmaps to match an identifying package+name to each message or enum. Originally, these could get overwritten during the name resolution process, leading to messages from imports overwriting more locally defined messages. Preventing such overwrites seems to solve **#160 and #170**. ### Diff
Show diff
Under `pb-rs/src/types.rs > impl FileDescriptor > get_full_names() > rec_full_names()`: ```diff if m.package.is_empty() { - full_msgs.insert(m.name.clone(), index.clone()); + full_msgs.entry(m.name.clone()).or_insert_with(|| index.clone()); } else { - full_msgs.insert(format!("{}.{}", m.package, m.name), index.clone()); + full_msgs.entry(format!("{}.{}", m.package, m.name)).or_insert_with(|| index.clone()); } ``` A few lines later: ```diff e.index = index.clone(); - full_enums.insert(format!("{}.{}", e.package, e.name), index); + full_enums.entry(format!("{}.{}", e.package, e.name)).or_insert(index); } ```

Minor incidental fix: is_unit()

Click to expand
`is_unit()` did not check nested messages; this was fixed by making it recursive. This was necessary to check if tests in this PR passed. ### Diff
Show diff
Under `pb-rs/src/types.rs > impl Message > is_unit()`: ```diff fn is_unit(&self) -> bool { - self.fields.is_empty() && self.oneofs.is_empty() + self.fields.is_empty() + && self.oneofs.is_empty() + && self.messages.iter().all(|m| m.is_unit()) } ```

Tests Created

Click to expand
- Issue 133 (no `.rs` test script, just checking if it can name-resolve and generate Rust files): - Basic case - `test_nested_basic_case_p0.proto` - `test_nested_basic_case_p1.proto` - A stress test - `test_nested_p0.proto` - `test_nested_p1.proto` - `test_nested_p2.proto` - `test_nested_p3.proto` - Issue 160: - Check if name resolution happens correctly in various scenarios - `test_name_resolution.rs` - `test_name_resolution_p0.proto` - `test_name_resolution_p1.proto` - `test_name_resolution_p2.proto` - `test_name_resolution_p3.proto` - `test_name_resolution_p4.proto` - Issue 170: - Stress test - `issue_170.rs` - `issue_170_a.proto` - `issue_170_b.proto` - `issue_170_c.proto` - `issue_170_d.proto` - `issue_170_e.proto`