qgis / QGIS

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)
https://qgis.org
GNU General Public License v2.0
10.38k stars 2.98k forks source link

Order of attributes in QGS XML is non-stable #29192

Open qgib opened 5 years ago

qgib commented 5 years ago

Author Name: Adam Liddell (@aaliddell) Original Redmine Issue: 21375 Affected QGIS version: 3.6.0 Redmine category:project_loading/saving


When saving a project, the order of the attributes on the XML elements in the saved QGS file appears to be non-stable. When trying to version control a project, this causes massive diff 'noise' from attributes being reordered, even when the actual data in the attributes is identical.

For example, on one save, an element may be written as: @<layer-tree-layer source="?query=SELECT..." id="some_layer" providerKey="virtual" expanded="1" name=some_layer" checked="Qt::Unchecked">@

The next time the file is saved, the order of the attrbutes may change for the same element: @@

Simply resaving a file immediately after a previous save does not appear to trigger this, so there is some 'change' in the project that causes the attributes to reorder. As a proposed solution, the attributes on the elements should be ordered stably, perhaps alphabetically?

The order of the XML elements within the document appears to be stable.

Tested in 3.4.4 and 3.6.0.

qgib commented 5 years ago

Author Name: Alessandro Pasotti (@elpaso)


IMO this is a won't fix, the attribute order is random due to QT internal implementation https://doc.qt.io/qt-5/qhash.html#algorithmic-complexity-attacks

You can control with an env var as explained in the docs.

qgib commented 5 years ago

Author Name: Adam Liddell (@aaliddell)


Ok, I see the issue with using QDomDocument in QT5 to get stable ordering. However, whilst using QT_HASH_SEED would work on one machine, the output is supposedly not necessarily stable across different machines or versions even when set to 0, which doesn't solve the (perhaps minor) problem of using shared version control on a project.

Would there be benefit to post-processing the generated XML into canonical form (https://en.wikipedia.org/wiki/Canonical_XML), as suggested here: https://geidav.wordpress.com/2013/02/27/deterministic-attribute-order-in-xml-documents-in-qt-5/? Perhaps as an option if you don't want to pay the price for every project.

m-kuhn commented 4 years ago

This has been improved in version 3.6 and together with the trackable project files plugin the XML is now considerably stable.

thbaumann commented 4 years ago

@m-kuhn: Thanks for the trackable project files plugin. I was just wondering if it would be hard to implement this as a part of QGIS core (C++)? I guess at the moment it is a neccessary workaround to save the project file, read it again, parse it and save it another time. Especially for big project files this takes quite a while. As far as I understood it's due to limitations of qt5 at the moment. Would it be possible to keep the project file content in the working memory until it is saved and then read / parse it before saving it to disk?

m-kuhn commented 4 years ago

@rdbath you are right, it's the save and read again part (and using another xml writer) that makes it impossible to have it in core.

As far as I can see the current situation is caused by a limitation of QtXml.

The only feasible solution I can see is this workaround here which could be accepted if we bend the rules.

thbaumann commented 4 years ago

@m-kuhn: thanks for the heads up. I wonder if the qt project itself will fix the random sorting of the XML... The workaround sounds interesting. Would this be hard to implement in QGIS? At my workplace we set up a nagios file-change-check (https://exchange.nagios.org/directory/Plugins/Security/checkfilechange/details) for Email notification yesterday and apart from that we git push the QGIS-projectfile (Template) if there were changes. I have no idea how many people will make use of tracking changes in QGIS project files but I think it is good to track the changes of important project files.

m-kuhn commented 4 years ago

I wonder if the qt project itself will fix the random sorting of the XML

I guess not

Note that the module will not receive additional features anymore. For reading or writing XML documents iteratively (SAX), we recommend using Qt Core's QXmlStreamReader and QXmlStreamWriter classes. The classes are both easier to use and more compliant with the XML standard.

Would this be hard to implement in QGIS?

It's not the "hard to do" it's the "bend the rules" part that makes me hesitate. On the other hand given the number of people who would like this to happen, we could actually really bend them.

thbaumann commented 4 years ago

@m-kuhn : it's probaby a newbie question but... which rule would be bent? ( one of https://docs.qgis.org/testing/en/docs/developers_guide/codingstandards.html ) ?

m-kuhn commented 4 years ago

None of these.

But: using one XML library to generate the output, reading it again with a different library just to write it again within a single application is very poor practice. That's why I felt much more comfortable doing a plugin for it in the first place.

nyalldawson commented 4 years ago

Another thing to consider: there's been rumblings on the qt development list that QXmlStreamReader and writer are on their way out...

elpaso commented 4 years ago

Reading the stack overflow accepted answer cited above, there is apparently an easy solution that makes the attribute order repeatable across machines. It worths a try.

pathmapper commented 3 years ago

This topic came up recently on the user mailing list: http://osgeo-org.1560.x6.nabble.com/QGIS-file-version-control-tt5486909.html

As reported in https://github.com/qgis/QGIS/issues/35964, this is also an issue with QGIS .model3 files which makes it currently troublesome to use git for version control of a QGIS model.

Edit: Likely the same issue with .qml and .qlr files.

rduivenvoorde commented 3 years ago

And about @nyalldawson argument as QXmlStreamReader/Writer being phased out:

At least in current Qt6 it's still in: https://doc-snapshots.qt.io/qt6-dev/qxmlstreamwriter.html and https://doc-snapshots.qt.io/qt6-dev/qxmlstreamreader.html#

pathmapper commented 3 years ago

From https://github.com/opengisch/qgis_trackable_project_files/issues/17#issuecomment-852182714 regarding extending the plugin to styles and models:

The approach to reformat can (most likely) be reused without any further ado. We'll need to check (and possibly extend) the API for appropriate signals to do this automatically post save.

luipir commented 3 years ago

FYI this thread @damateos

kannes commented 8 months ago

For Qt6 https://doc.qt.io/qt-6/qtxml-index.html says:

Note: Qt XML will no longer receive additional features. For reading or writing XML documents iteratively (SAX), use the QXmlStreamReader and QXmlStreamWriter classes. The classes are both easier to use and more compliant with the XML standard.

That sounds like QXmlStreamReader and QXmlStreamWriter will stay at least for lifetime of Qt6. So maybe a solution based on those would be viable for QGIS?


Having nice small diffs for projects were QGIS projects, styles, etc are tracked in Git would be fantastic, especially for better acceptance of non-technical users when they can see the few(er) actual changes in diffs. :)

andreasneumann commented 8 months ago

See https://codereview.qt-project.org/c/qt/qtbase/+/441704/2 and https://codereview.qt-project.org/c/qt/qtbase/+/441705/4

kannes commented 8 months ago

Nice, sounds like the stable attribute order was added with Qt 6.5 (which is the current LTS version of Qt 6)

kannes commented 1 week ago

@pathmapper and I are testing this with Qt6 builds right now: https://github.com/kannes/qgis-qt6-xml-order/

We will report here what we find.

kannes commented 1 week ago

Looks good: With modern Qt6-based QGIS builds, the order of attributes is stable. Woohoo!

You can see my findings in https://github.com/kannes/qgis-qt6-xml-order/pulls

So every test with Qt6 led to the same order of attributes, even if a file was saved with a different attribute order in between.

I "only" tested with a QGIS project (.qgs), a style file (.qml) and a layer definition (.lyr). It should be the same for model files (.model3) and any other XML I presume.