nyalldawson commented 7 years ago

QGIS Enhancement: Rework datum transform handling

Date 2017/08/15

Author Nyall Dawson (@nyalldawson) (also @m-kuhn, @3nids, @wonder-sk, @jef-n, @andreasneumann )

Contact nyall.dawson@gmail.com

Version QGIS 3.0

Summary

While QGIS 2.x includes some support for handling datum transforms when reprojecting, the current support has some limitations:

Use of datum transforms is "opt-in" in code, and only used in a few select code paths (such as map rendering). Most coordinate transforms were never adapted to utilise datum transforms, and the current API for handling transforms is too cumbersome to allow widespread code modification to support the transforms.
The UX for datum transform selection is confusing and not user friendly.

While QGIS suffers these shortcomings, demand for robust datum transform support is increasing due to countries with large QGIS userbases implementing new projection systems which require datum transformations. For 3.0 we need a reworked datum transform API which forces users of coordinate transforms to respect the user's datum transformation settings (i.e. required vs opt-in) and an improved UX which guides users through correctly setting up datum transformations inside QGIS.

Additionally, we need the API to handle the looming introduction of temporal dependant datum transforms (such as GDA2020 for Australia).

This was discussed at the recent QGIS hackfest, with the following proposal sketched out:

Proposed Solution

Datum transforms will be set uniquely for each pair of source -> destination CRS within each project. This is a change from 2.x, where datum transforms can be set on a "per-layer" level (e.g. within a single project a unique combination of source->dest CRS can have different datum transforms per layer in the project). Handling this case requires a much more complicated API, and no strong use case could be determined which requires maintaining this facility. Datum transform setup must be done within projects to allow projects to be seamlessly shared between QGIS instances (and between client/server) without loss of datum transform settings.
Users can pre-define a set of "default transforms" via the global QGIS options. These transforms would be automatically inherited on creation of a new project within QGIS. This allows users to setup once in advance a number of common source->dest CRS pairs and the transform which should be used for that conversion, and have confidence that future projects will correctly handle these transforms. Additionally, use of the new user profiles/shared settings support in 3.0 will allow organisations to predefine these common transforms for their users.
A new QgsCoordinateTransformContext class will be created. Initially this class will only contain a map for source/destination CRS pairs to the corresponding datum transforms to be utilised for this combination. In future (when more details about the proposed architecture is available) support for dynamic temporal based transforms can be added to QgsCoordinateTransformContext - e.g. a "transform datetime" member for specifying the desired datetime which the transform should target.
QgsProject will gain a QgsCoordinateTransformContext member, with public const/non-const reference getters.
QgsCoordinateTransform constructor will be modified from the current

QgsCoordinateTransform( const QgsCoordinateReferenceSystem &source, const QgsCoordinateReferenceSystem &destination )

signature to instead include a compulsory QgsCoordinateTransformContext argument:

QgsCoordinateTransform( const QgsCoordinateReferenceSystem &source, const QgsCoordinateReferenceSystem &destination, const QgsCoordinateTransformContext& context )

This forces users of the QgsCoordinateTransform class to handle correct datum transform by requiring them to specify this QgsCoordinateTransformContext argument before performing and coordinate conversion. The QgsCoordinateTransformContext argument will be used to preset the source and dest datum transforms for the coordinate transform automatically (these will still be overwritable using setSourceDatumTransform and setDestDatumTransform if explicitly required).

To simplify construction of QgsCoordinateTransform, an overload which accepts a QgsProject argument instead of the QgsCoordinateTransformContext argument will also be added:

QgsCoordinateTransform( const QgsCoordinateReferenceSystem &source, const QgsCoordinateReferenceSystem &destination, const QgsProject* project )

This constructor will automatically retrieve the transform context from the project before using the same code paths as the QgsCoordinateTransformContext argument constructor - no permanent reference to the project will be stored.

This addition is designed to allow easier use of QgsCoordinateTransform API for PyQGIS users - plugins can resort to the 2.x style 'hack' of relying on the project instance ct = QgsCoordinateTransform(source_crs, dest_crs, QgsProject.instance() ) . While use of the project instance is discouraged from master code (and blocked from new inclusions in the core library), it will not be removed from the 3.0 api. This means plugins and scripts converting from 2.x api will only need to add the additional QgsProject.instance() argument to QgsCoordinateTransform constructors to be sure of correct handling of datum transforms in their code.

All QgsCoordinateTransform constructors throughout the master code will be upgraded to the new API, using the correct project reference wherever possible in core (and falling back to QgsProject::instance() for gui/app)
No compatibility will be given to upgrading 2.x project datum transform settings. Instead users are required to setup datum transforms in their existing projects using the new "single transform for each src/dest crs" approach. We will make mention in the release notes of this requirement.
Appropriate unit tests will be added covering all core changes.

Affected Files

Multiple!

Further Considerations/Improvements

(optional)

Backwards Compatibility

None - and 2.x projects will require manual user modification to re-setup datum transformations.

Issue Tracking ID(s)

https://issues.qgis.org/issues/16846

Votes

(required)

wonder-sk commented 7 years ago

Sounds like a good summary of what we have discussed at the hackfest! Big +1 for consistent handling of datum shifts everywhere in QGIS.

Maybe one more piece in the puzzle will be ensuring that if there are changes in the global/project datum shifts, these changes get propagated to any long-living coordinate transform instances (not sure if there are any?)

3nids commented 7 years ago

Datum transformation

thanks a lot for setting this up. As Matthias pointed out, Proj4 reprojects systematically data via the WGS84 ellipsoid. There is an explanation. This means that Datum shifts can be associated to a single CRS and not a couple. For the Swiss use case:

only the code of the old projection contains the magical setting nadgrids + = chenyx06etrs.gsb that actually allows you to use the transformation grid (this gsb file,precisely). Switzerland’s new projection being compatible with it, the loss of precision is due to the shift to the WGS84 ellipsoid.

http://www.camptocamp.com/en/actualite/mapserver-and-gdalogr-with-proj4-8-0-and-the-news-swiss-projections/

Now, this leaves us the choice of setting this per CRS or per pair of CRS. It might be more intuitive per pair, but that means one would need to duplicated as many times as one needs of output CRSs.

API

API wise, do you consider having the method with project instance to be available in Python only? To avoid its use in the app? Then we could just write some SIP code to do the QgsProject::instance() call and keep the current signature as is. Only plugins for server would have to be careful with this. What do you think?

nyalldawson commented 7 years ago

Now, this leaves us the choice of setting this per CRS or per pair of CRS. It might be more intuitive per pair, but that means one would need to duplicated as many times as one needs of output CRSs.

Unless I'm misunderstanding (which is highly likely - given that we're talking about datum transforms, which no-one ever understands fully) you're wanting to set a transform on ALL projections coming from a set source CRS regardless of the dest CRS?

In that case, I'd say we add the logic in QgsCoordinateTransformContext. E.g.

QgsCoordinateTransformContext has a map/hash of src crs -> datum transform, something like QMap< QgsCoordinateReferenceSystem, int > mSourceDatumTransforms
It also has a map/hash for src/dest crs pairs: QMap< QPair< QgsCoordinateReferenceSystem, QgsCoordinateReferenceSystem> > mSourceDestDatumTransforms.
When constructing QgsCoordinateTransform, if the src/dest combo exists in the context's mSourceDestDatumTransforms, use that - if not, check mSourceDatumTransforms. I.e. the pairs take precendence.

Would this also apply for dest crs? e.g. any source crs going to a set dest crs would have a transform applied? If so, similar logic can apply.

API wise, do you consider having the method with project instance to be available in Python only? To avoid its use in the app?

No, i'd also see it used in core/gui/app, but just using the correct project pointer and not QgsProject::instance(). It's just a bit nicer then always having to do QgsCoordinateTransform ct( srcCrs, destCrs, project->coordinateTransformContext() )

Then we could just write some SIP code to do the QgsProject::instance() call and keep the current signature as is. Only plugins for server would have to be careful with this. What do you think?

I'd rather change the signature. I think it's good for plugin authors to be aware that these transforms exist and why the extra argument is required. Especially when temporal transforms become commonplace.

nyalldawson commented 7 years ago

Maybe one more piece in the puzzle will be ensuring that if there are changes in the global/project datum shifts, these changes get propagated to any long-living coordinate transform instances (not sure if there are any?)

We'll have to check that. I can't think of any offhand.

3nids commented 7 years ago

The datum shift as defined within PROJ4 allows to transform (back and forth) to WGS84 (although . So as far as I understand it, the concept of src/dest is meaningless. We would then just need the first map/hash you gave in point 1. Does it make sense?

Then how handling transformations within CRS on the same datum (not WGS84 but for which we have a datum shift)? In such scenario, I think it is more efficient to avoid going through a datum shift given as a grid, but instead using the "default" transformation from PROJ. But... do we care?

It would be nice if someone with a better understanding of datum could jump in. Maybe @pka @rouault @strk ?

nyalldawson commented 7 years ago

The datum shift as defined within PROJ4 allows to transform (back and forth) to WGS84 (although . So as far as I understand it, the concept of src/dest is meaningless. We would then just need the first map/hash you gave in point 1. Does it make sense?

But do we need a separate transform for when the crs is a dest (ie from wgs84 -> dest?) Or is the same grid always used regardless of direction?

It certainly simplifies things if we have a single crs/transform combination.

Then how handling transformations within CRS on the same datum (not WGS84 but for which we have a datum shift)?

Are you saying having the same CRS as both source/dest, but pushing the coordinates through a grid anyway? If so, does this ever happen? Seems odd to me. At least... until we hit the temporal factor that is! (But let's defer that discussion given that it's all basically theoretical for now and there's no concrete specs about how this will be handled)

3nids commented 7 years ago

But do we need a separate transform for when the crs is a dest (ie from wgs84 -> dest?) Or is the same grid always used regardless of direction?

yes, grid is the same. It can be used in both direction.

Are you saying having the same CRS as both source/dest, but pushing the coordinates through a grid anyway?

Not the same CRS but the same datum (i.e. ellipsoid + geoid + cinematic model + transformation to a worldwide reference system, WGS84 in our case with PROJ). You can have different CRS on the same datum, and transformation wouldn't need to go through the grid back and forth (using simpler mathematical model from default transformation in PRIOJ would be more efficient). But again, I think it's an edge case, and we should not worry. And I might be mis-interepreting the things here...

nyalldawson commented 7 years ago

Not the same CRS but the same datum (i.e. ellipsoid + geoid + cinematic model + transformation to a worldwide reference system, WGS84 in our case with PROJ). You can have different CRS on the same datum, and transformation wouldn't need to go through the grid back and forth (using simpler mathematical model from default transformation in PRIOJ would be more efficient). But again, I think it's an edge case, and we should not worry. And I might be mis-interepreting the things here...

Got you. I'd say edge case, which should really be handled in PROJ.

andreasneumann commented 7 years ago

@3nids @nyalldawson Sorry for joining the discussion late. Was on holidays.

Actually, in Switzerland there are two grids: one "CHENYX06a" directly from EPSG:21781 to EPSG:2056 and in the other direction and "CHENYX06_etrs" for the conversion to ETRS (the european reference system). This is necessary for combining data with our neighbour countries. Both grids are available at https://shop.swisstopo.admin.ch/en/products/geo_software/GIS_info - see also formulas at https://www.swisstopo.admin.ch/content/swisstopo-internet/en/topics/survey/reference-systems/_jcr_content/contentPar/tabs_copy/items/dokumente/tabPar/downloadlist/downloadItems/292_1485793819776.download/refsys_e.pdf

I don't think it is enough to have a single datum definition per CRS, we really need the combinations of source and destination.

Disclaimer: I am also not an expert on this topic, I may be wrong. But there is probably a reason why Swisstopo provides several grid shift files, and not just one.

@mhugent and @edigonzales - can you perhaps join the discussion?

jef-n commented 7 years ago

I also think the grids are tied to certain transformations (eg. BETA2007 used in Germany to DHDN to UTM). The EPSG database has a set of transformations (I think that's where most of srs.db's tbl_datum_transform orginates from) and the source and destination pairs are not unique:

sqlite> select source_crs_code, target_crs_code, count(*) from tbl_datum_transform
  group by source_crs_code,target_crs_code
  having count(*)>1
  order by count(*) desc limit 5;
4230|4326|35
4267|4326|29
4208|4326|23
4284|4326|17
4263|4326|16

3nids commented 7 years ago

@andreasneumann There is only one datum per CRS (otherwise the CRS would not be defined). The two grids are redundant, and I think that the ETRS is useless in the case of using PROJ.4. The ETRS grid allows to directly convert LV03 (21781) to ETRS89 (based on WGS80). But PROJ4 cannot do such direct conversion and always go through WGS84. Tranformation from WGS84 to ETRS89 doesn't require a grid, it has a direct formula see 1.4 in Swisstopo's document

@jef-n indeed grids are tied to transformations, as in the Swiss example. But the way PROJ handles this is different as everything goes through WGS84. So one has to use the proper grid.

Does it make sense?

rouault commented 7 years ago

Ah interesting topic...

The current behaviour of proj.4 to do datum transformation is indeed to attach a +nadgrids= or +towgs84= argument to the source and target SRS, using WGS84 as the pivot. [edited: see https://github.com/qgis/QGIS-Enhancement-Proposals/issues/100#issuecomment-324281978 for corrected workflow] For a given CRS, you can have several valid towgs84/grids to transform it to another datum (let's say the pivot WGS84) depending on the area of use. For example some ancient datum (let's say ED50 EPSG:4230) have per-country transformation (typically TOWGS84 parameters) to WGS84 / EPSG:4326 So a pair of (source CRS, target CRS) isn't enough to define a transformation. The area of use is also an important parameter. If you had a project with a shapefile of France in geodetic ED50 and another of Germany in geodetic ED50 and want to reproject to WGS84, you'd need to use 2 different transformations. And even for the same source CRS, target CRS, area of use, the EPSG database can propose different transforms in some cases I believe. But the user can pickup just the one that makes sense (I cannot really imagine that a situation where (source CRS, target CRS, area of use) would call for a different transform depending on the layer is a common one)

Also good to know: the TOWGS84 parameter which was part of WKT 1 spec is no longer part of WKT 2 spec. Said otherwise the transform from a CRS to another one isn't really part of the CRS definition.

All the above was with current proj.4... Now let's talk about the future ! proj.4 is undergoing changes (addition of a "pipeline" API, see https://www.fig.net/resources/proceedings/fig_proceedings/fig2017/ppt/iss6b/ISS6B_evers_knudsen_9156_ppt.pdf for a short introduction) to avoid WGS84 being the pivot, since that adds inaccuracies. There are some datums for which there's no grid to/from WGS84. For example the new Japan datum after the Fukushima earthquake has only a transform to the old Japan datum, and none to WGS84. So future proj.4 versions will allow to just specify a grid transfrom without requiring WGS84 at one of the end.

jef-n commented 7 years ago

@jef-n indeed grids are tied to transformations, as in the Swiss example. But the way PROJ handles this is different as everything goes through WGS84. So one has to use the proper grid.

AFAIK that's how NTv2 works. The values in the grid are in degrees. So it is first transformed into WGS84, the correction is applied and the result transformed to the other CRS.

giohappy commented 7 years ago

Does using nadgrids implicate using the WGS84 pivot? I thought NTv2 grids would make a direct transformation from source to dest CRS, isn'it?

Giovanni

rouault commented 7 years ago

Does using nadgrids implicate using the WGS84 pivot?

Yes

I thought NTv2 grids would make a direct transformation from source to dest CRS, isn'it?

Grids potentially allow to transform from datum A to datum B directly (like from the new Japan datum to the old Japan datum example I mentionned above), but to use grids, whatever their format, with proj.4 +nadgrids the grid must be from datum A to WGS84 (or something that you consider equivalent to WGS84, like ETRS89 in Europe, NAD83 in North Amerrca, although this approximation was good in 1984 but has now around one meter inaccuracy with continental plate drift)

rouault commented 7 years ago

@kbevers @busstoptaktik I don't want to distract you from other tasks, but if you have some time to read this thread about CRS transformations in QGIS, your lights would be appreciated

giohappy commented 7 years ago

Grids potentially allow to transform from datum A to datum B directly (like from the new Japan datum to the old Japan datum example I mentionned above), but to use grids, whatever their format, with proj.4 +nadgrids the grid must be from datum A to WGS84 (or something that you consider equivalent to WGS84, so ETRS89 in Europe, NAD83 in North Amerrica, although this approximation was good in 1984 but has now around one meter inaccuracy with continental plate drift)

Thanks Even. For example I have some NTv2 grids from Roma40 (an italian datum) to ED50. In this case I have a direct trasnformation from the two CRSs, where ED50 is my target. Do you mean that this grid wouldn't work with +nadrgrids? What's different from transforming to, e.g., ETRS89?

Giovanni

jef-n commented 7 years ago

@giohappy did you know that GDAL can open grids (and in turn QGIS can show them)? The NTv2 driver at least returns SRS_WKT_WGS84 for any grid.

rouault commented 7 years ago

Do you mean that this grid wouldn't work with +nadrgrids?

Yes, you can't use it with +nadgrids, at least not without cheating. You could probably cheat with something like "+proj=longlat +a=roma40_a +f=roma40_f +nadgrids=your_grid_from_roma40_to_ed50 +to +proj=longlat +ellps=intl +nadgrids=@null" (I didn't try !!!)

What's different from transforming to, e.g., ETRS89?

ETRS89 was defined to match WGS84 with submeter accuracy at the time where it was initialy defined, so you could say ETRS89==WGS84 at that time. But this is getting less and less true.

andreasneumann commented 7 years ago

Perhaps we can have a look at a concrete example:

Here is the grid-shift file titled "CHENYX06a.asc" from https://shop.swisstopo.admin.ch/en/products/geo_software/GIS_info

Here is the header:

.... VERSION NTv2.0
DATUM_F CH1903
DATUM_T CH1903+ MAJOR_F 6377397.155 MINOR_F 6356078.963 MAJOR_T 6377397.155 MINOR_T 6356078.963 SUB_NAMECHENyx06 .....

The header of the grid shift file says that this grid shift file is for going from CH1903 (EPSG:21781) to CH1903+ (EPSG:2056). I am confused - how would this grid shift file help for transformation to WGS84 when it is made to go directly from EPSG:21781 to EPSG:2056 or vice versa?

The question is: are really all transformations going through proj4 or not? Perhaps QGIS does something internal without proj4 in cases of datum transformations?

3nids commented 7 years ago

@andreasneumann CH1903+ is directly convertible to WGS84 via formulas since it's based on it, while CH1903 is not. Hence the grid is required from CH1903 to WGS84 (and vice versa) but not between CH1903+ and WGS84.

@rouault huge thanks for these explanations!

If you had a project with a shapefile of France in geodetic ED50 and another of Germany in geodetic ED50 and want to reproject to WGS84, you'd need to use 2 different transformations.

Yes. The idea is that one would use a single transformation per QGIS project (one would set a global default, but it would be a per project definition). As you pointed, it's an extremely sharp cutting edge case where one would need different transformations per combination in the same project. Let's skip this!

Considering the tight delays for API refactoring of QGIS 3, what do you suggest?

stick to current PROJ behavior, and store one transformation per CRS to go to WGS84
or save this on a couple of CRSs basis (which would make some redundancies, but I guess it's not a big deal as one won't have tens of output CRSs).

@nyalldawson what are your thoughts on how solving this?

andreasneumann commented 7 years ago

@3nids thanks for explanation. Makes sense.

rouault commented 7 years ago

Sorry, I was wrong on the proj.4 current logic. The grids express offsets in longitude, latitude, not in geocentric coordinate space. So the corrected flow is:

convert from UTM on DatumA to geodetic (long, lat) on DatumA
if nadgrids is set on source CRS, convert from geodetic on DatumA to geodetic on WGS84 using the grid, and then to geocentric (X, Y, Z relative to Earth center) on WGS84
or if towgs84 is set on source CRS, convert from geodetic on DatumA to geocentric on DatumA, and then apply towgs84 to go to geocentric on WGS84
if nadgrids is set on target CRS, convert from geocentric on WGS84 to geodetic on WGS84, and from geodetic on WGS84 to geodetic on DatumB using the (reversed) grid
or if towgs84 is set on target CRS, convert from geocentric on WGS84 to geocentric on DatumB using (convert from geocentric on WGS84) towgs84, and from geocentric on DatumB to geodetic on DatumB
convert from geodetic on DatumB to UTM on DatumB --> done !

giohappy commented 7 years ago

Thanks very much @rouault it wasn't so clear to me. Your last explanation makes sense. I suppose that your "cheat" trick could work, given that you're explicting the destination CRS (whit null grid). I will give it a try ;)

Giovanni

rouault commented 7 years ago

To be future proof, the API could allow for an optional QgsRectangle that would contain the area of use (bounding box in long/lat. Whatever the CRS of those long/lat is ! it doesn't really matter here !).

QgsCoordinateTransform( const QgsCoordinateReferenceSystem &source, const QgsCoordinateReferenceSystem &destination, const QgsCoordinateTransformContext& context, const QgsRectangle& areaOfUse = QgsRectangle())

For now this wouldn't be used. It looks to me that the way we build a proj.4 transformation from sourceCRS, targetCRS, context and areaOfUse is mostly an implementation detail that could be refined in a minor release. In a first time, we could have just the old naive way where the transform to WGS84 is associated with the CRS. And later, there could be an advanced setup for people who know what they are doing where you could define the pipeline for tuple (source CRS, target CRS), and possibly (source CRS, target CRS, area of use)

busstoptaktik commented 7 years ago

Without having any clue about QGIS internals, I tend to agree with @rouault: Basically, you need to know what you are doing when transforming between systems, and hence, you should select a specific transformation going directly from system A to system B.

However, for cases where meter (or perhaps decimeter) accuracy is sufficient, we may help a non-expert user by automatically selecting a somewhat appropriate solution.

There is a relevant remark in pj_obs_api.c on the master branch of http://github.com/OSGeo/proj.4:

/*****************************************************************************/
PJ  *proj_create_crs_to_crs (PJ_CONTEXT *ctx, const char *srid_from, const char *srid_to) {
/******************************************************************************
    Create a transformation pipeline between two known coordinate reference
    systems.

    srid_from and srid_to should be the value part of a +init=... parameter
    set, i.e. "epsg:25833" or "IGNF:AMST63". Any projection definition that is
    can be found in a init-file in PROJ_LIB is a valid input to this function.

    For now the function mimics the cs2cs app: An input and an output CRS is
    given and coordinates are transformed via a hub datum (WGS84). This
    transformation strategy is referred to as "early-binding" by the EPSG. The
    function can be extended to support "late-binding" transformations in the
    future without affecting users of the function.

    Example call:

        PJ *P = proj_create_crs_to_crs(0, "epsg:25832", "epsg:25833");

******************************************************************************/

i.e. currently proj_create_crs_to_crs takes the "classical PROJ.4 way" of constructing a pipeline (a concatenation of operations), using whatever masquerades as WGS84 as a pivot datum. But when time permits, we may implement a lookup in a dual indexed list to see whether we have a direct route from system A to system B, before defaulting to the bad old way.

In general, please take a look at pj_obs_api.c and proj.h, which implements and defines a next-generation API for PROJ.4, which is more suitable for use with the dynamic/kinematic reference frames, that have been prevalent globally for now almost 30 years.

Kristian Evers, @kbevers, made some remarks about this in a recent presentation at the QGIS developer workshop in Nødebo, a few weeks ago. You can find his slides over at https://docs.google.com/presentation/d/10N0SDKU3CeDu7qMAhDWHW6RTnmGMszybR3XKSqfntyc/edit#slide=id.p4

kbevers commented 7 years ago

@kbevers @busstoptaktik I don't want to distract you from other tasks, but if you have some time to read this thread about CRS transformations in QGIS, your lights would be appreciated

Let me try. I am glad that this is being discussed and I am happy to pitch in wherever I can. As mentioned above we are currently working on providing a new API for PROJ.4 which makes it easier to take care of many of the problems highlighted above. It is still a work in progress but there will soon be a new release where the new API will be available. It is for the most part already locked down for the next release, but further improvements will be made in the future.

The new PROJ.4 API makes it possible to do more complicated coordinate transformations than was ever possible with the old API, including temporal transformations. There is still not much documentation written for the new API, but I think the general idea is understable from the header file proj.h. I suggest everyone in this discussion take a good look at the new PROJ.4 API and read the paper @busstoptaktik and I published earlier this year. The paper gives a good overview of how the concept of "transformation pipelines" work.

The new PROJ.4 API focuses more on the path between to CRS's than the source and destination CRS's. That is a crucial difference. Conceptually this is harder to grasp, but in turn it allows more precise transformations and makes it possible to do transformations that are not defined in turns of WGS84. The API provides a function called proj_create_crs_to_crs which sets up a transformation object between for instance to different CRS's in the EPSG-dataset. For now it uses WGS84 as a hub datum, but in a later release I plan to change that and have it consult the EPSG-database for a more direct path between source and destination. This will all be done behind the scenes and will not affect the public API.

My recommendation is to use the new PROJ.4 API in QGIS. Delegate as much work as possible to PROJ.4. This way all the geodetic problems already highlighted in this proposal can be solved. Some of them already are, for instance the temporal component.

These were just some quick thoughts on top of my head. I see that Thomas has also answered this while I've been typing. Sorry for any redundancies :-) I am happy to answer any questions you may have.

mhugent commented 7 years ago

@rouault: The current datum transformation in QGIS already uses +nadgrids=/+nadgrids=@null to directly go from one datum to the other. Not sure however about the internals of proj4, I thought that it should not go through wgs84 (unless the grid shift file transforms to wgs84 of course)

busstoptaktik commented 7 years ago

@mhugent: Your statement essentially says what QGIS does, while you on the other hand state that you do not know about PROJ.4 internals.

Does this mean that QGIS does datum shifts using dedicated non-PROJ.4 code?

mhugent commented 7 years ago

@busstoptaktik: QGIS uses only proj4 for projection and datum transform. There is no dedicated non-PROJ.4 code

rouault commented 7 years ago

@busstoptaktik @mhugent refered to the https://github.com/qgis/QGIS/blob/master/src/core/qgscoordinatetransform_p.cpp#L290 code that is the logic to build the the proj.4 string

giohappy commented 7 years ago

@rouault @mhugent this is essentially the "cheating" trick you described yesterday, isn'it?

kbevers commented 7 years ago

Grids potentially allow to transform from datum A to datum B directly (like from the new Japan datum to the old Japan datum example I mentionned above), but to use grids, whatever their format, with proj.4 +nadgrids the grid must be from datum A to WGS84 (or something that you consider equivalent to WGS84, like ETRS89 in Europe, NAD83 in North Amerrca, although this approximation was good in 1984 but has now around one meter inaccuracy with continental plate drift)

This can be done in the next release of PROJ.4 without the hassle of the WGS84 pivot datum by utilizing the new API and transformation pipelines. Lets say you have UTM coordinates in the old japanese datum and want to transform them to UTM coordinates referenced to the new datum, it can be done like this:

+proj=pipeline +step
+proj=utm +zone=54 +inv +step
+proj=hgridshift +grids=japan_old_to_new.gsb +step
+proj=utm +zone=54

So here the transformation is done directly without going through WGS84. So with the new PROJ.4 version it is not necessary to "cheat" to do this type of transformation. Note that the transformation pipeline above is just one proj-string composed of three individual operations. This is the essence of transformation pipelines.

nyalldawson commented 7 years ago

Great to see this being discussed so widely!

The most pressing thing we need to decide on here is how to expose this to the QGIS API - the conversion to the PROJ pipeline API can be handled later (without breaking the frozen QGIS 3.0 API)

Based on the discussed points, I'm proposing that:

A QgsCoordinateTransformContext (and by association, QgsProject) stores datum transforms by a combination of source CRS, dest CRS AND an optional map extent.
When transforming coordinates, we use this logic to decide what datum transforms to use:
- If the context has a transform with matching (valid) source and destination CRSes:
  - Check through all these transforms, looking for one with a map extent in the context corresponding to the source point location OR one with an invalid/null extent (prioritising valid, matching extents)
  - If none found, look for any transforms with matching source CRS where the dest CRS in the context is invalid (prioritising those with valid, matching extents)
    - If none found, look for any transforms with matching dest CRS where the source CRS in the context is invalid (prioritising those with valid, matching extents)
    - if none found, look for any transforms with invalid source/dest CRS in the context (prioritising those with valid, matching extents)

This basically means that projects can have AT most one datum transform for each pair of source/dest CRS per extent (i.e. different transforms with overlapping extent and with the same source/dest CRS in a single project is NOT supported)

Would this approach meet all requirements?

busstoptaktik commented 7 years ago

This basically means that projects can have AT most one datum transform for each pair of source/dest CRS per extent (i.e. different transforms with overlapping extent and with the same source/dest CRS in a single project is NOT supported)

Would this approach meet all requirements?

I do not think so - a potential counterexample would be grids (or Helmert transformations) going from national realizations of European Datum 1950 (ED50) to ETRS89. Since most European countries are not particularly box shaped, an area definition based on bounding boxes would result in overlapping areas.

Since historically, datums have been defined/realized nationally, we need to be able to specify regions as jurisdictions as well.

History is messy! But with global observation systems, collaboration and huge computational efforts, modern reference frames can be establised on global scale.

But Earth itself is messy too: Continental drift and crustal deformation make terrestrial reference frames inherently complex, and we should probably not expect to be able to handle everything automatically, but rather provide means for those who know what they do, to do what they need. Note that this will imply supporting handling the time of observations for data in QGIS, i.e. going spatio-temporal. With the growing need for accuracy, this will become necessary sooner than most of us expect.

The comittee draft of the upcomming revision of ISO19111 (Geographic Information - Referencing by Coordinates) introduces the concept of "datum ensembles", for datums/reference frames that can be considered equivalent, given accuracy needs are modest. This suggests a way to go: If only the generic datum name is given (e.g. ED50), use whatever comes first. If the datum name is qualified by a region name (e.g. ED50(DK), or DK:ED50), use the region specific version.

But even this will not solve everything automagically - this is a hard, hard problem, and as the need for accuracy grows, the hard parts expand their region of influence from a few handfuls of geodesists in national mapping agencies, and universities, to a larger group of everyday practitioners in a number of industries.

@kbevers and I try to provide means for those practitioners to handle this through PROJ.4, but while we can provide the nuts and bolts needed to construct the datum transformation machinery, we can never guess what the user means - especially not if the user is either unsure or indifferent.

3nids commented 7 years ago

@busstoptaktik thanks again for your contribution.

I do not think so - a potential counterexample would be grids (or Helmert transformations) going from national realizations of European Datum 1950 (ED50) to ETRS89. Since most European countries are not particularly box shaped, an area definition based on bounding boxes would result in overlapping areas.

Doesn't this mean that source CRS are not the same (although they share the same datum)? In such case, why having a single definition per project is not sufficient?

busstoptaktik commented 7 years ago

@3nids can you elaborate a bit? I do not understand what you mean.

3nids commented 7 years ago

If I understand correctly, we define the transformation from one CRS to another and not from datum to another one. So if we have a grid, it would be used for a defined combination of CRSs (possibly several combinations) and it's up to the user to define this. Hence I thought that the extent is implicit but reading again the thread it is apparently not. This goes a bit (too much) beyond my understanding, sorry for the noise.

nyalldawson commented 7 years ago

@3nids @busstoptaktik do we have a consensus here? Time is running out to get this implemented for the 3.0 api, so i'd like to conclude this discussion quickly and get a proposal updated to reflect the outcomes and new design.

3nids commented 7 years ago

I am a bit confused here. I would need confirmation @busstoptaktik for the distinction between datum and CRS. We're speaking of going from CRS to CRS and not datum to datum. Hence I believe your approach of (source CRS, dest CRS, extent) is sufficient. What am I missing?

nyalldawson commented 7 years ago

I don't think you're missing anything - I just pinged you since you were involved in the final discussion points on this thread.

3nids commented 7 years ago

@nyalldawson do you have a somehow clear vision of the topic? and/or feel confident with your proposal?

busstoptaktik commented 7 years ago

@nyalldawson - whichever way you decide to design a new transformation API, please be aware that

You need to pass full 4D data to the underlying transformations to make this work in the generic reference case (obviously, in many cases setting height=0 and observation time=start-of-GPS-time is the proper thing to do, and in many cases h & t are constants for each data set)
Regions will mostly (for any INSTANT in history) be non-overlapping, but not necessarily "rectangular in lat/lon space". You may need to support representation of (e.g.) ISO-3166 country codes, to help the user by making an automatic selection of the most appropriate transformation parameters/grids
For the cases where the user knows better, let the user specify exact reference frame names

rouault commented 7 years ago

My comment is largely included in @busstoptaktik above one:

We had a discussion / try on the proj.4 side of things per https://github.com/OSGeo/proj.4/pull/565 to enhance the proj_transform_crs_to_crs() function, but it appears that even a bbox added to (source CRS, dest CRS) isn't always sufficient to get an unambiguous transform because of the complex shapes of countries.

The example given in https://github.com/OSGeo/proj.4/pull/565#pullrequestreview-61734777 might be source CRS is ETRS89 EPSG:4258 and dest CRS is WGS84 EPSG:4326 and you work on Danish islands ( see https://www.openstreetmap.org/#map=8/56.511/11.063 ) or Swedish islands ( see https://www.openstreetmap.org/#map=11/55.1369/11.1504 ) that are both included in the bbox of Danmark and Sweden, but you need different shifts if it in Danmark or Sweden. So a transform is governed by a mix of geography and politics.

If you get a dataset/layer per country then you could need a different transform per layer.

So perhaps QgsCoordinateReferenceSystem woud need to be extended to accept a country / region / juridiction name. And then you could possibly customize a transform for a 4-uplet (source_crs_code, source_crs_juridiction_name, target_crs_code, target_crs_juridiction_name), with a special 'Default' juridiction name. Quite backwards regarding the aim of this QEP to simplify the UX :-(

kbevers commented 7 years ago

A few additional comments:

1. Coordinate Reference Systems

We're speaking of going from CRS to CRS and not datum to datum

A Coordinate Reference System, or CRS is generally comprised of both a coordinate system and a datum, so when you are speaking of going from one CRS to another CRS your are inherently also speaking of going from one datum to another. In many cases the datum, or reference frame, will be the same, but you have to consider the general case in this application. I like this figure from the EPSG Guidance Note 7-1 that schematically explains how coordinates in two different CRS's relate:

The box in the middle is what PROJ.4 is taking care of.

2. Make sure you your new API is compatible with the new PROJ.4 API

As mentioned previously in this thread, there's a new API for PROJ.4 on the way, which will be able to handle datum shifts a lot better in the future. I have set up a webpage with some preliminary documentation that should give you an idea of what is to come. Consider the functions proj_create_crs_to_crs and proj_transform before you freeze your new API.

3nids commented 7 years ago

Thank you very much for these details.

The original idea was that transformations should be configured project wise in QGIS (and not system/application-wise). Also QGIS should not be asking each time which transformation it should use. If we go that road, I think there is no need for country-code neither extent as only one conversion will be possible in a project. i.e. you'll work either with Swedish or Danish (data), but not both at the same time.

Now, if I understand correctly, we could define transformations application-wise and therefore additional info to determine the proper transformation (optional extent + optional country code + optional epoch) is required. And finally, layers in QGIS should also get a country code for this for required case. And epoch would be either layer property or defined in data.

Shall we put epoch on the side for now?
Should we define transformations application or project wise? Application seems to be the way to go, but this might/should be duplicated in the project to ensure coherence when sharing projects.

giohappy commented 7 years ago

From the user perspective please let's keep it simple for the generic user. Having transformations for country and epoch shouldn't require the user to choose between too many options if he doesn't need such specific and precise transformtions.

I hope this discussion will lead to more powerful tools, though without increasing the complecity for the average user.

kbevers commented 7 years ago

Shall we put epoch on the side for now?

Don't worry too much about, just make sure you don't shoot yourself in the foot and make it impossible to introduce at a later time. My advice is to think of all coordinates as 4-dimensional. You will need this before you realize. Leave the t-component empty for now.

Should we define transformations application or project wise? Application seems to be the way to go, but this might/should be duplicated in the project to ensure coherence when sharing projects.

Transformations will have to be dataset/layer-wise. Do you in fact mean application/project CRS that various datasources are transformed to? In that case, either is fine for my.

I think the best solution would be to let QGIS have two overall "modes" of transformation: "Basic" and "advanced". "Basic" should the default that will satisfy most users needs and essentially mimic what QGIS does today (just select the most likely transformation). "Advanced" should present the user for the various different transformation options that might be possible between two given CRS's.

At this point it is of course all very theoretical, but in the future it is my intention to have the proj_transform_crs_to_crs function query the EPSG database when settig up the transformation. A method of signaling that there is several transformation option can be created, which QGIS could benefit from in case "advanced mode" is on. Does that make sense?

3nids commented 7 years ago

"Advanced" should present the user for the various different transformation options that might be possible between two given CRS's.

That is not very possible to ask the user every time both in terms of UX (transformations might happen on mouse move over the map) and code structure (most of the transformations happen in core, while asking would end up in gui/app). The idea was to let the user define a project setting for a combination (hence my statement that you could not work with Swedish and Danish islands at the same time in their national CRS while the project would be in whatever other CRS). And to fallback to default if nothing is defined on the project. Then if PROJ.4 comes with better defaults at the end it's a bonus.

At this point it is of course all very theoretical, but in the future it is my intention to have the proj_transform_crs_to_crs function query the EPSG database when settig up the transformation. A method of signaling that there is several transformation option can be created, which QGIS could benefit from in case "advanced mode" is on. Does that make sense?

a lot! but again the point is that it's not that easy to bring the choice to the user when you are actually doing the transform. They need to be pre-defined in the project once for all.

@nyalldawson what about going as CRS / optional extent / optional country code for source and destination. We do as we thought as first and define this project wise (with application wise defaults). Country codes might be added as property of layers.

note to self: handle the case of missing grid when sharing projects

kbevers commented 7 years ago

a lot! but again the point is that it's not that easy to bring the choice to the user when you are actually doing the transform. They need to be pre-defined in the project once for all.

I am not sure my point came across, so I'll try to expand a bit. Of course, the user determines an overall CRS they want to work with in the project/application. When adding a new dataset, if you are in advanced mode and more than one possible path from layer CRS to project CRS exist, you'll be asked to choose which one you want. Or it could be a tab in the properties of a layer where the advanced user can select his preferred transformation definition (this way you could probably even do without modes...).

One thing that is important to keep in mind for coordinate transformation is that it is the path between CRS's that's important, not the CRS's themselves.

Unfortunately this stuff is not easy to get right, which is why users who require high accuracy coordinates will have to activate their brain if they want to be succesfull. My hope is that QGIS in the future will be able to facilitate that instead of "just" going with a default transformation that is "good enough" for most users.

qgis / QGIS-Enhancement-Proposals

Rework datum transform handling #100

QGIS Enhancement: Rework datum transform handling

Summary

Proposed Solution

Affected Files

Further Considerations/Improvements

Backwards Compatibility

Issue Tracking ID(s)

Votes