QGIS Enhancement: Fix handling of provider default value clauses/Autogenerate/nextval(...) handling
Date 2022/02/14
Author Nyall Dawson (@nyalldawson)
Contact nyall dot dawson at gmail dot com
Version QGIS 3.26/3.28
Summary
A consistent source of issues and user frustation in QGIS stems from how QGIS treats provider-side default values for attributes of newly created features.
The issues are seen in many situations, but in particular are commonplace with:
The "FID" attribute in a GeoPackage layer, which uses a "Autogenerate" default value clause
Default value clauses in Postgres tables, or columns from a Postgres table which use "serial" field types or series (resulting in nextval(...) type default clauses)
Default value clauses in MS Sql Server or Oracle tables
The primary key attribute in a Spatialite table
Some typical issues associated with these tables include:
QGIS gets confused in some circumstances and incorrectly treats the default value clause as a string literal value, which causes issues when QGIS attempts to
store a string like 'Autogenerate' or 'nextval(...)' in a numeric column
Attribute tables and form widgets treat the default value clauses text as a string value, so a user must manually delete out the "Autogenerate" or "nextval(...)"
text which shows as the default value for these attributes before entering their actual values
Processes in QGIS which transfer features between layers get confused and always treat the clause as a string literal. For instance, a typical workflow which results
in these issues is to create some new features in a Geopackage layer, leaving the FID column with its default "Autogenerate" value. Without saving the table, the user
then tries to export the table as a different data format, or runs a Processing tool on the layer. The export/tool fails, as it will attempt to store an "Autogenerate"
string value inside numeric FID columns.
Various attempts have been made to fix these issues in an ad-hoc (one-by-one) manner, but ultimately they are a result of an API limitation in QGIS and cannot be completely
solved without re-working the associated API.
Proposed Solution
In order to fix the underlying causes of these issues, it will be necessary to rework the QGIS API relating to handling default value clauses.
Currently, default values are handled in QGIS by:
When loading a layer into QGIS, the data provider (QgsVectorDataProvider) subclass will retrieve and store all default value clauses from the backend for all columns
in the table.
When features are created which are associated with that layer, the attributes will be initially set to a text string matching the provider's default value clause
for each attribute. (Via QgsVectorDataProvider::defaultValueClause(int fieldIndex))
When QGIS is saving a new feature to a layer (or updating an existing features' attributes), the value of each attribute is compared against the corresponding
QgsVectorDataProvider::defaultValueClause. If the values match, then NO value is specified for that column when generating the SQL to insert the new feature (or
update the existing one), so that the backend database will correctly use the default value clause to generate a value.
The problem with the current approach is that there is way to determine if an attribute for a particular feature has a value set OR should take the backend's
default value for that column WITHOUT querying the vector data provider. This is not always possible, e.g. when a feature has been "detached" from its original
layer via a copy/paste operation or other similar operation.
In order to resolve this, the following new API is proposed:
A new simple struct/class will be created for QgsUnsetAttributeValue. Initially this class will contain a single string member, which will contain the original
provider backend's default value clause. (This member is for presentation/debugging and API compatiblity purposes only, and has no further use in the API.)
The QgsUnsetAttributeValue class will be metatyped, so will be possible to store as a QVariant value.
When new features are created, the initial value for attributes will be set to QgsUnsetAttributeValue object values for any fields which have a provider-side
default value clause (as opposed to the string literal values currently used)
Instead of checking if an attribute's value is unchanged from the provider default by comparing its string value with the QgsVectorDataProvider::defaultValueClause
value, utility methods in QgsAttributes and QgsFeature will be introduced which test whether the attribute is set by checking whether the variant type stored
for the attribute matches the QgsUnsetAttributeValue type.
This new API will allow QgsFeature objects to be passed freely around all different parts of QGIS, from layer to layer, without risk of confusing a default
value clause with a literal string value.
API Compatibility
Currently, plugins or scripts may be using the following method to test whether a particular attribute from a feature matches the default value clause for
a provider:
if feature.attribute(5) == layer.dataProvider().defaultValueClause(5):
# skip this, it's a provider default value
continue
In order to avoid breaking this code, QgsUnsetAttributeValue will have an bool operator==( const QString& value ) const; operator which can test for
equality between the QgsUnsetAttributeValue and a string value. This operator will test whether the string value matches the QgsUnsetAttributeValue stored
provider default value clauses member (as described earlier).
Relationship to Qt 6/Python for QT work
An ongoing issue with the current PyQGIS "sip" based bindings is handling of NULL vs unset attribute values. This is currently handled by a PyQGIS specific
special "NULL" value, which is used like this:
if feature.attribute(5) == NULL:
# it's null!
as opposed to:
if feature.attribute(5) is None:
# it's never been set...
This distinction causes many issues in QGIS code, and discussion regarding the QGIS 4.0 API changes have leaned towards removal of this distinction and
treating NULL/not set values both as Python None values. This is also a necessary step toward possible future replacement of the "sip" based Python
bindings with "Python for Qt" based bindings.
This proposal will help alleviate this issue, by making the distinction between Null vs "not set" attributes clearer. For QGIS 4.0, "Null" attributes can then always be treated identically to Python None values, and "unset" attributes stored/treated as QgsUnsetAttributeValue values.
Deliverables
The proposed new API will be introduced into QGIS, and all code logic relating to handling of default value clauses updated to use the new API
Unit tests covering the new API/approach
Quality documentation in the PyQGIS/c++ code API describing the new API
QGIS Enhancement: Fix handling of provider default value clauses/Autogenerate/nextval(...) handling
Date 2022/02/14
Author Nyall Dawson (@nyalldawson)
Contact nyall dot dawson at gmail dot com
Version QGIS 3.26/3.28
Summary
A consistent source of issues and user frustation in QGIS stems from how QGIS treats provider-side default values for attributes of newly created features.
The issues are seen in many situations, but in particular are commonplace with:
nextval(...)
type default clauses)Some typical issues associated with these tables include:
Various attempts have been made to fix these issues in an ad-hoc (one-by-one) manner, but ultimately they are a result of an API limitation in QGIS and cannot be completely solved without re-working the associated API.
Proposed Solution
In order to fix the underlying causes of these issues, it will be necessary to rework the QGIS API relating to handling default value clauses.
Currently, default values are handled in QGIS by:
QgsVectorDataProvider::defaultValueClause(int fieldIndex)
)QgsVectorDataProvider::defaultValueClause
. If the values match, then NO value is specified for that column when generating the SQL to insert the new feature (or update the existing one), so that the backend database will correctly use the default value clause to generate a value.The problem with the current approach is that there is way to determine if an attribute for a particular feature has a value set OR should take the backend's default value for that column WITHOUT querying the vector data provider. This is not always possible, e.g. when a feature has been "detached" from its original layer via a copy/paste operation or other similar operation.
In order to resolve this, the following new API is proposed:
QgsUnsetAttributeValue
. Initially this class will contain a single string member, which will contain the original provider backend's default value clause. (This member is for presentation/debugging and API compatiblity purposes only, and has no further use in the API.)QgsVectorDataProvider::defaultValueClause
value, utility methods in QgsAttributes and QgsFeature will be introduced which test whether the attribute is set by checking whether the variant type stored for the attribute matches the QgsUnsetAttributeValue type.This new API will allow QgsFeature objects to be passed freely around all different parts of QGIS, from layer to layer, without risk of confusing a default value clause with a literal string value.
API Compatibility
Currently, plugins or scripts may be using the following method to test whether a particular attribute from a feature matches the default value clause for a provider:
In order to avoid breaking this code, QgsUnsetAttributeValue will have an
bool operator==( const QString& value ) const;
operator which can test for equality between the QgsUnsetAttributeValue and a string value. This operator will test whether the string value matches the QgsUnsetAttributeValue stored provider default value clauses member (as described earlier).Relationship to Qt 6/Python for QT work
An ongoing issue with the current PyQGIS "sip" based bindings is handling of NULL vs unset attribute values. This is currently handled by a PyQGIS specific special "NULL" value, which is used like this:
as opposed to:
This distinction causes many issues in QGIS code, and discussion regarding the QGIS 4.0 API changes have leaned towards removal of this distinction and treating NULL/not set values both as Python None values. This is also a necessary step toward possible future replacement of the "sip" based Python bindings with "Python for Qt" based bindings.
This proposal will help alleviate this issue, by making the distinction between Null vs "not set" attributes clearer. For QGIS 4.0, "Null" attributes can then always be treated identically to Python None values, and "unset" attributes stored/treated as QgsUnsetAttributeValue values.
Deliverables