eminence / xmltree-rs

Reads an XML file into a simple tree structure
MIT License
39 stars 30 forks source link

Attribute namespaces are not preserved #13

Open jethrogb opened 6 years ago

jethrogb commented 6 years ago

Attributes can have namespaces, such as in:

<mdui:UIInfo>
  <mdui:DisplayName xml:lang="en">TestShib Test IdP</mdui:DisplayName>
  <mdui:Description xml:lang="en">TestShib IdP. Use this as a source of attributes for your test SP.</mdui:Description>
  <mdui:Logo height="88" width="253">https://www.testshib.org/testshibtwo.jpg</mdui:Logo>
</mdui:UIInfo>

(full XML at https://www.testshib.org/metadata/testshib-providers.xml)

The xml namespace on the lang attribute is not stored:

Element {
    prefix: Some(
        "mdui"
    ),
    namespace: Some(
        "urn:oasis:names:tc:SAML:metadata:ui"
    ),
    namespaces: Some(
        Namespace(
            {
                "": "urn:oasis:names:tc:SAML:2.0:metadata",
                "ds": "http://www.w3.org/2000/09/xmldsig#",
                "mdalg": "urn:oasis:names:tc:SAML:metadata:algsupport",
                "mdui": "urn:oasis:names:tc:SAML:metadata:ui",
                "shibmd": "urn:mace:shibboleth:metadata:1.0",
                "xml": "http://www.w3.org/XML/1998/namespace",
                "xmlns": "http://www.w3.org/2000/xmlns/",
                "xsi": "http://www.w3.org/2001/XMLSchema-instance"
            }
        )
    ),
    name: "DisplayName",
    attributes: {
        "lang": "en"
    },
    children: [],
    text: Some(
        "TestShib Test IdP"
    )
},
ctrlcctrlv commented 2 years ago

@jethrogb if you still care about this issue, I believe I've solved it in #33. It requires a semver-breaking API change. To make the upgrade more comfortable for API consumers I threw the entire weight of modern Rust trait derivation behind my newtype:

https://github.com/eminence/xmltree-rs/blob/03bc8020bbb79c8339ede2928bda0ed0febb687f/src/namespace.rs#L9-L13

Going even farther for users of old versions of @eminence's API, I decided to delegate all functions of XmlNamespace:

https://github.com/eminence/xmltree-rs/blob/03bc8020bbb79c8339ede2928bda0ed0febb687f/src/namespace.rs#L24-L39

There is no way within delegate crate to do trait impl delegation, so the last remaining trait, IntoIterator, I did by hand:

https://github.com/eminence/xmltree-rs/blob/03bc8020bbb79c8339ede2928bda0ed0febb687f/src/namespace.rs#L41-L47

I added finally the obvious opposite trait (FromIterator) that @netvl ought to consider upstream—

https://github.com/eminence/xmltree-rs/blob/03bc8020bbb79c8339ede2928bda0ed0febb687f/src/namespace.rs#L49

Due to the care taken to make the upgrade seamless I hope @eminence approves.

adri326 commented 2 years ago

I'm experiencing the same issue, and dropping namespaces in attributes is a bit of a deal-breaker. Is there an ETA on when the PR will be merged or a known blocker to the merge?

Switching to the PR repository instead of the crates.io version fixed the issue and did not cause any incompatibility with my existing codebase.

ctrlcctrlv commented 2 years ago

@adri326 For me it's not just a bit of a deal breaker so the MFEK project uses my fork until #33 is approved. If it never is, I guess I can maintain it in perpetuity and backport any changes (should any be made) here.

ctrlcctrlv commented 2 years ago

@adri326 If you're interested, I just pushed the two changes I was missing from this master branch and reconciled #19 with my master at MFEK/xmltree.rlib in MFEK@8ca231ce9abd1a984b9be3e4d6438bceef2f58c5 and MFEK@447dcd6a39e7a154d07a5fc764afddc254f798ae.

Hopefully the MFEK branches don't need to be a long-term solution for you, but they can be. The tests do pass. The work by @dyst5422 looked fine to me so that's why I merged it.

thomas725 commented 1 year ago

Hey there!

I just stumbled upon this issue (that reading an xml file with this library, manipulating the in memory tree and writing it back out to a file causes attribute namespaces to get lost, see also: https://gitlab.com/thomas351/xml-sorter-rs )

I read above that @ctrlcctrlv maintains a fork that fixes that issue, so I've switched my dependency to that repository: https://github.com/MFEK/xmltree.rlib - works fine :)

Would be great if the fix could be merged into a crates.io release.

eminence commented 1 year ago

Thanks for the bump. It sounds like the fix from @ctrlcctrlv in #33 has some good reviews and successful usage in their fork.

I'll try to review these changes and get some stuff merged soon. Thanks for your patience.

ctrlcctrlv commented 1 year ago

My fork works fine for my use case (and the use cases of many others) but the issues raised by @tombailo and others are serious concerns as the whole point of namespaces is being able to have the same XML attr name provided by two different DTD's.

@tombailo made an attempt to patch overtop #33 in #38. We could really use a discussion between all of us on how to move forwards, I hadn't make progress as you'd not been around but the code in #38 looks like it may be workable and fix all issues but is a radical change.