Artoria2e5 / PRCoords

Public Domain library for rectifying Chinese coordinates
https://artoria2e5.github.io/PRCoords/demo.html
GNU General Public License v3.0

PRCoords

People's Rectified Coordinates (PRCoords) is a cross-language implementation of "public secret" Chinese coordinate obfuscation methods including GCJ-02 and BD-09, along with general deobfuscation methods previously established in ChinaMapShift, eviltransform, and geoChina. (Referring to the process of replacing straight lines with wavy ones as a "transform" is euphemism overdone.)

For background on China's geographic obfuscation, see Restrictions on geographic data in China and 中华人民共和国测绘限制 on Wikipedia.

Languages

(should I split them into submodules?)

For languages not yet supported, we recommend checking out eviltransform (MIT) or geoChina (GPLv3, R) instead.

API

PRCoords' APIs operate on, and return, dedicated structures for coordinates. In API names, we generally refer to WGS-84 as wgs, GCJ-02 as gcj, and BD-09 (lat-lon) as bd.

Inverse functions

The obfuscations generally have these properties to maintain basic usefulness:

  1. obfs(coord) is sort of close to coord.
  2. obfs(a) - obfs(b) is usually close to a - b. (The closer a and b are to each other, the better it works.)

In general two approaches of inverting the "forward" obfuscations, or working from obfs(coord) to coord, are implemented:

The ΔRoundtrip entry on the demo page shows how well these methods work. Unless you are doing archival work, you generally don't have to iterate.
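To see why properties 1 and 2 are enough to invert an obfuscation, here is a minimal Python sketch of a single-step correction and a fixed-point iteration. Everything in it is an assumption made for illustration: `toy_obfs` is a made-up smooth displacement and not GCJ-02 or BD-09, and the function names are hypothetical rather than PRCoords' actual API.

```python
import math

def toy_obfs(lat, lon):
    """Hypothetical stand-in for a forward obfuscation (NOT GCJ-02 or BD-09)."""
    dlat = 0.002 * math.sin(lon * math.pi / 3.0)
    dlon = 0.003 * math.sin(lat * math.pi / 3.0)
    return lat + dlat, lon + dlon

def rough_inverse(obfs, lat, lon):
    """One-step inverse: assume the shift at the obfuscated point
    is about the same as the shift at the true point (property 2)."""
    o_lat, o_lon = obfs(lat, lon)
    # true ~= bad - (obfs(bad) - bad)
    return 2 * lat - o_lat, 2 * lon - o_lon

def iterative_inverse(obfs, lat, lon, tol=1e-9, max_iter=20):
    """Fixed-point refinement: nudge the guess until obfs(guess) matches the input."""
    g_lat, g_lon = rough_inverse(obfs, lat, lon)
    for _ in range(max_iter):
        o_lat, o_lon = obfs(g_lat, g_lon)
        e_lat, e_lon = lat - o_lat, lon - o_lon  # residual in obfuscated space
        g_lat += e_lat
        g_lon += e_lon
        if max(abs(e_lat), abs(e_lon)) < tol:
            break
    return g_lat, g_lon

true = (31.2, 121.5)
bad = toy_obfs(*true)
rough = rough_inverse(toy_obfs, *bad)
exact = iterative_inverse(toy_obfs, *bad)
```

With this toy function the one-step result is already close, and the iteration only removes the small residual, which matches the advice that iterating is rarely needed.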

The "in China" sanity check

Typically PRCoords is only supposed to be run on obfuscated input data, which are primarily Chinese coordinates. For this reason, initial implementations include a very rough sanity check that spans a rectangular region on a Mercator-projected map. This check can be overridden by passing a boolean value, or may not be implemented at all in certain languages if I am not in the right mood for doing silly things.
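A rectangular check with a boolean override could look roughly like the Python sketch below. The bounds are placeholder numbers picked for illustration, not the values PRCoords uses, and the names `rough_in_china`, `guarded`, and `check_china` are hypothetical. Returning the input unchanged when the check fails is also just one possible choice.

```python
# Hypothetical bounds in degrees, for illustration only.
LAT_MIN, LAT_MAX = 0.0, 56.0
LON_MIN, LON_MAX = 72.0, 138.0

def rough_in_china(lat, lon):
    """Very rough test: a lat/lon box, which is a rectangle on a Mercator map."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX

def guarded(transform, lat, lon, check_china=True):
    """Apply transform only when the point passes the check (or the check is off)."""
    if check_china and not rough_in_china(lat, lon):
        return lat, lon  # assumption: leave out-of-range points untouched
    return transform(lat, lon)

# A dummy transform so the behaviour is easy to see.
shift = lambda lat, lon: (lat + 1, lon + 1)
```

Calling `guarded(shift, 51, 0)` leaves the point alone, while `guarded(shift, 51, 0, check_china=False)` applies the transform anyway.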

There is also an "insane" sanity check, written for IITC, that approximates the range of Google's and Baidu's distortion: js/insane_is_in_china.js. It is basically a ray-casting polygon check with 70 vertices. You, as the caller, should still be responsible for telling whether a point is part of the gov-screwed Chinese data.
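For readers unfamiliar with ray casting, here is a minimal Python sketch of the general technique. It is not a port of js/insane_is_in_china.js: the function name is hypothetical and the square polygon is a toy stand-in for the real 70-vertex outline.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray casting: count how many edges a ray going east from the point crosses.

    polygon is a list of (lat, lon) vertices; an odd crossing count means inside.
    """
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        lat_i, lon_i = polygon[i]
        lat_j, lon_j = polygon[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat_i > lat) != (lat_j > lat):
            lon_cross = lon_i + (lat - lat_i) * (lon_j - lon_i) / (lat_j - lat_i)
            if lon < lon_cross:
                inside = not inside
        j = i
    return inside

# Toy polygon (a square), standing in for the real outline.
square = [(0, 0), (0, 10), (10, 10), (10, 0)]
```

The cost is linear in the number of vertices, so a 70-vertex outline stays cheap even when called per point.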

FAQ

Why another wheel?

Can the systems be described as WKT or proj-strings?

Not directly as a datum, because in both representations a datum is either "sane" (no non-linearity in 3D, Helmert possible) or a big table of grids.

It should be possible to describe the two CS with a PROJECTION entry as a PROJCS. Since a PROJCS cannot be nested in another, the BD transformation must be described using WGS84 and a fused GCJ-BD projection. The situation is similar with Baidu "Mercator".

Speculative WKT/PROJ4

```js
PROJCS["Baidu 2009, Pseudo-Mercator",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["CN_Obfs_Baidu_2009_Mercator"],
    AXIS["x",east],
    AXIS["y",north],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    EXTENSION["PROJ4","+proj=baidumerc +units=m +nadgrids=@null +wktext +no_defs"],
    AUTHORITY["EPSG","888002"]]

PROJCS["Chinese BSM 2002, Pseudo-Ellipsoidal",
    GEOGCS["WGS 84",
        AUTHORITY["EPSG","4326"]],
    PROJECTION["CN_Obfs_GCJ_2002_Ellipsoidal"],
    AXIS["longitude",east],
    AXIS["latitude",north],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    EXTENSION["PROJ4","+proj=gcjlonglat +units=deg +nadgrids=@null +wktext +no_defs"],
    AUTHORITY["EPSG","888000"]]
```

The good people at proj4js have made their stuff very easy to extend. Here is an example.

Should I use fast fp math?

Yes. Nobody knows what the original looks like anyways, so what's wrong with letting the compiler recombine a bit more? You can't be more off than the one-meter random error (in "EMQ") anyways.

Or tinker with 32-bit floats and fixed-point numbers. Or try approximation tools like Sollya or MC++. Really, just search on the Internet for "<language> Taylor Chebyshev Model". You only need less than 1e-6 error on a not-very-large slice of the Earth anyways.

I threw TaylorModels.jl at GCJ-02, and got decent results out of it. Still too lazy to put it in code though. Check out approx/approx.ipynb. (Nope, not decent. Gotta do it properly some day, just don't use the notebook and expect it to work!)

I tried another route with the C++ version using a devmaster user Nick's sinpi() approximation. It seems to be good enough for 1e-6: check out cpp/bench_out and cpp/badmath.hh.
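For a feel of what such an approximation looks like, here is a Python sketch of the widely circulated parabola-based fast sine, rewritten for sin(πx) on [-1, 1]. Whether cpp/badmath.hh uses exactly this form is an assumption, and `sinpi_approx` is a name made up for this example. The sine itself is only accurate to the order of 1e-3 here.

```python
import math

def sinpi_approx(x):
    """Approximate sin(pi * x) for x in [-1, 1] without calling a trig function."""
    # First pass: a parabola through (-1, 0), (0, 0), (1, 0) with peaks at +-0.5.
    y = 4.0 * x * (1.0 - abs(x))
    # Second pass: blend in the squared parabola to flatten the error.
    return 0.225 * (y * abs(y) - y) + y

# Worst absolute error on a grid over [-1, 1].
max_err = max(
    abs(sinpi_approx(k / 1000.0) - math.sin(math.pi * k / 1000.0))
    for k in range(-1000, 1001)
)
```

The input has to be range-reduced to [-1, 1] first, which this sketch leaves to the caller.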

Physical PRCoords

You can print out a minimal copy of PRCoords with this PDF file. I am working on some better options in issue #2. A fairly simple tote bag with an older version of the PDF is available from Teespring.

Feel free to print and sell t-shirts with the PDF file! It is put in the Public Domain, so you don't have to pay me for that. You can always fund my subversive activities on Patreon though.

License

Unless otherwise mentioned, all files in this package, including this README file, are dual-licensed under:

GPL is only included for fun here.

Sources

See also

Oh, and finally, here is an official news report on that particular [bleep] who came up with GCJ-02.