lumeohq / xsd-parser-rs

A xsd/wsdl => rust code generator written in rust
Apache License 2.0

Name Collisions #103

Open mlevkov opened 4 years ago

mlevkov commented 4 years ago

Hello,

I used an XSD as input to the parser. The generated code contains struct fields with colliding names: the same field name appears more than once, with different associated types.

For example, this XSD: https://gist.github.com/mlevkov/a9a5f77473668e7d53f64a2b5ee3ccfa

It contains a type called SegmentTemplateType, which builds on other related types. Some of these types have the same field names, so expanding them into a single struct produces duplicate names, like this:

#[derive(Default, PartialEq, Debug, YaSerialize, YaDeserialize)]
#[yaserde(namespace = "urn:mpeg:dash:schema:mpd:2011")]
pub struct SegmentTemplateType {
    #[yaserde(attribute, rename = "media")]
    pub media: Option<String>,

    #[yaserde(attribute, rename = "index")]
    pub index: Option<String>,

    #[yaserde(attribute, rename = "initialization")]
    pub initialization: Option<String>,

    #[yaserde(attribute, rename = "bitstreamSwitching")]
    pub bitstream_switching: Option<String>,

    #[yaserde(rename = "SegmentTimeline")]
    pub segment_timeline: Option<SegmentTimelineType>,

    #[yaserde(rename = "BitstreamSwitching")]
    pub bitstream_switching: Option<Urltype>,

    #[yaserde(attribute, rename = "duration")]
    pub duration: Option<u32>,

    #[yaserde(attribute, rename = "startNumber")]
    pub start_number: Option<u32>,

    #[yaserde(rename = "Initialization")]
    pub initialization: Option<Urltype>,

    #[yaserde(rename = "RepresentationIndex")]
    pub representation_index: Option<Urltype>,

    #[yaserde(attribute, rename = "timescale")]
    pub timescale: Option<u32>,

    #[yaserde(attribute, rename = "presentationTimeOffset")]
    pub presentation_time_offset: Option<u64>,

    #[yaserde(attribute, rename = "indexRange")]
    pub index_range: Option<String>,

    #[yaserde(attribute, rename = "indexRangeExact")]
    pub index_range_exact: Option<bool>,

    #[yaserde(attribute, rename = "availabilityTimeOffset")]
    pub availability_time_offset: Option<f64>,

    #[yaserde(attribute, rename = "availabilityTimeComplete")]
    pub availability_time_complete: Option<bool>,
}

Notice that initialization and bitstream_switching each appear twice: once as an attribute of type Option<String> and once as an element of type Option<Urltype>.
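To make the cause concrete, here is a minimal sketch of how such duplicates could be detected before code is emitted. It assumes a simplified, ASCII-only snake_case conversion; `to_snake_case` and `find_collisions` are hypothetical helpers written for this illustration, not functions from xsd-parser-rs.

```rust
use std::collections::HashMap;

/// Convert an XSD name such as "bitstreamSwitching" or "BitstreamSwitching"
/// into a snake_case Rust identifier (simplified, ASCII only).
fn to_snake_case(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // Insert a separator before every capital except a leading one.
            if i != 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Group XSD names by the Rust field name they map to and return only the
/// groups with more than one source name, sorted by field name.
fn find_collisions(xsd_names: &[&str]) -> Vec<(String, Vec<String>)> {
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    for name in xsd_names {
        groups
            .entry(to_snake_case(name))
            .or_default()
            .push(name.to_string());
    }
    let mut collisions: Vec<_> = groups
        .into_iter()
        .filter(|(_, sources)| sources.len() > 1)
        .collect();
    collisions.sort();
    collisions
}
```

Calling `find_collisions` with the attribute and element names of SegmentTemplateType reports `bitstream_switching` (from `bitstreamSwitching` and `BitstreamSwitching`) and `initialization` (from `initialization` and `Initialization`), which are the two duplicates in the struct above.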

I took a slightly different approach when building an XML parser from this XSD: I moved the attributes and the elements into separate structs that are flattened into the higher-level aggregate type. This makes it explicit which parts of a type are attributes and which are elements. Here is SegmentTemplateType written that way:

#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
struct SegmentTemplate {
    #[yaserde(flatten)]
    multi_segment_base: MultiSegmentBase,
    #[yaserde(flatten)]
    attribute: SegmentTemplateAttribute,
}

#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
#[yaserde(
prefix = "mpd",
namespace = "mpd: urn:mpeg:dash:schema:mpd:2011",
default_namespace = "mpd"
)]
struct SegmentTemplateAttribute {
    #[yaserde(rename = "media", attribute)]
    media: String,
    #[yaserde(rename = "index", attribute)]
    index: String,
    #[yaserde(rename = "initialization", attribute)]
    initialization: String,
    #[yaserde(rename = "bitstreamSwitching", attribute)]
    bitstream_switching: String,
}

// <!-- Segment Timeline -->
#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
struct SegmentTimeline {
    #[yaserde(flatten)]
    element: SegmentTimelineElements
}

#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
#[yaserde(
prefix = "mpd",
namespace = "mpd: urn:mpeg:dash:schema:mpd:2011",
default_namespace = "mpd"
)]
struct SegmentTimelineElements {
    #[yaserde(rename = "S")]
    s: Vec<S>
}

#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
struct S {
    #[yaserde(flatten)]
    attribute: SAttributes
}

#[derive(Default, Debug, Clone, PartialEq, YaSerialize, YaDeserialize)]
#[yaserde(
prefix = "mpd",
namespace = "mpd: urn:mpeg:dash:schema:mpd:2011",
default_namespace = "mpd"
)]
struct SAttributes {
    #[yaserde(rename = "t", attribute)]
    t: u64,
    #[yaserde(rename = "n", attribute)]
    n: Option<u64>,
    #[yaserde(rename = "d", attribute)]
    d: u64,
    #[yaserde(rename = "r", attribute)]
    r: Option<i64>,
}

You can find a work-in-progress version here: https://gist.github.com/mlevkov/094a3c671b175bba5d0a6511d8c0d348

I'm not yet certain that treating "expansion/extension" as a base_field is the right decision, but for now I'm thinking about how to avoid "confusion" or "collision" between similarly named parts of the structure. I had not run into a collision in my own work, but when I saw the output generated by xsd-parser-rs, I realized it was at least worth mentioning, so that you can consider this case with a concrete example.
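As one possible direction, a generator could keep a flat struct and rename only the fields that clash, based on whether each came from an attribute or an element. The sketch below is an assumption for discussion, not how xsd-parser-rs works: `Kind`, `disambiguate`, and the `_attr` / `_elem` suffixes are all hypothetical.

```rust
use std::collections::HashMap;

/// Whether a field originates from an XSD attribute or an XSD element.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Attribute,
    Element,
}

/// Given (snake_case name, kind) pairs, append a kind suffix to every name
/// that occurs more than once. Unique names are returned unchanged.
/// Note: two attributes (or two elements) with the same name would still
/// collide; this sketch only separates attribute/element clashes.
fn disambiguate(fields: &[(&str, Kind)]) -> Vec<String> {
    // First pass: count how often each field name occurs.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for (name, _) in fields {
        *counts.entry(*name).or_insert(0) += 1;
    }
    // Second pass: suffix only the names that clash.
    fields
        .iter()
        .map(|(name, kind)| {
            if counts[name] > 1 {
                let suffix = match kind {
                    Kind::Attribute => "attr",
                    Kind::Element => "elem",
                };
                format!("{name}_{suffix}")
            } else {
                (*name).to_string()
            }
        })
        .collect()
}
```

For the struct above, `initialization` would become `initialization_attr` and `initialization_elem`, while `media` stays as it is. The existing `#[yaserde(rename = "...")]` attributes would keep the XML names intact, so only the Rust-side identifiers change.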

Side note: it's interesting that the approaches you've taken are the very things I stumbled upon while writing a parser for the DASH media type. Kudos for the great thought process.