Project-OSRM / node-osrm

DEPRECATED Part of osrm-backend since 5.7. NodeJS bindings for OSRM
BSD 2-Clause "Simplified" License
141 stars 48 forks source link

String escaping broken in v4.6.0 branch #83

Closed TheMarex closed 9 years ago

TheMarex commented 9 years ago

That is the geometry response from the osrm.route test on v4.6.0

  route_geometry: 'y~pdcBojfsXuBtD_KzQyEdJ|ElL|u@pb@jPdJn^rLhRjGbNdGnRrEg@xJwRdhE[pIi@jKyHv}AqEl}@m@vJu@nJ{@`HeAtGuDnOeEzLaHzM{BxE{BxHiAjIe@rIJfMbCnLlAhE~BhD`C|AbH`@fEcCtEoC`FeDfCgArFL~PxHdh@tW`n@`b@n\\\\hYhU`Rra@~`@x_An~@hSnPgJd^wFtT}Lte@kp@~mCkRbw@uKvd@cJz`@oC|OdNdCpOzBnZnEzH~@ha@zFbGf@rTaAxEbA`HxBtPpFpa@rO_Cv_B_ZlDaMlB',

That is the response from master using OSRM v4.5.0:

  route_geometry: 'y~pdcBojfsXuBtD_KzQyEdJ|ElL|u@pb@jPdJn^rLhRjGbNdGnRrEg@xJwRdhE[pIi@jKyHv}AqEl}@m@vJu@nJ{@`HeAtGuDnOeEzLaHzM{BxE{BxHiAjIe@rIJfMbCnLlAhE~BhD`C|AbH`@fEcCtEoC`FeDfCgArFL~PxHdh@tW`n@`b@n\\hYhU`Rra@~`@x_An~@hSnPgJd^wFtT}Lte@kp@~mCkRbw@uKvd@cJz`@oC|OdNdCpOzBnZnEzH~@ha@zFbGf@rTaAxEbA`HxBtPpFpa@rO_Cv_B_ZlDaMlB',

There is a subtle difference: instead of two "\" in the new response there are four (before the hYhU substring). So this suggests there is some sort of escaping happing somewhere?

The response form osrm-routed is fine btw.

DennisOSRM commented 9 years ago

OK, sounds like the string is escaped a second time. Adding a std::cout at the right spot shows that it's the NanNew() call that does the Escaping.

DennisOSRM commented 9 years ago

Found a little bit of background: http://www2.warrensburgr6.org/boundary/encoder/pitfalls.html

DennisOSRM commented 9 years ago

It looks like this:

The polyline algorithm produces output may contain backslashes. This output is usually stored in strings in which JavaScript wants to escape anything that's a (potential) control character. So, there's a some inconsistency. The inconvenient thing is that osrm-routed does not double quote the backslashes but it should do as JSON is essentially JavaScript.

Question is, where do we fix this in a consistent way?

updated list to reflect current findings, Thu, Feb, 26th

TheMarex commented 9 years ago

@DennisOSRM could not reproduce the incompatibility in polyline: https://github.com/mapbox/polyline/pull/7

In general the escaping should only happen if a string containing "\" is encoded as JSON. The JSON parsing should remove the double "\".

There seems to be some confusion between representation of a string in JavaScript and the actual value of the string in memory. What I would expect to happen when NanNew is called, that the string will have the actuall value of the const char*, not the value of the string if you would want to represent it in JavaScript source.

DennisOSRM commented 9 years ago

Yes, this inconsistent behavior is utterly confusing. I'd expect NanNew() to do the same thing. Do you know if we can tell NaN to treat string as a value instead of a literal (or vice versa).

jfirebaugh commented 9 years ago

I am very skeptical that NanNew does any escaping.

TheMarex commented 9 years ago

Yep @jfirebaugh is right. The actual escaping is happening in polyline_encoder.cpp:59 This works fine when you use osrm-routed because the actual JSON serializer in OSRM does not escape the strings again.

TheMarex commented 9 years ago

Fix is tracked in https://github.com/Project-OSRM/osrm-backend/pull/1397

DennisOSRM commented 9 years ago

I am very skeptical that NanNew does any escaping.

Well, NanNew() (or any function/c'tor it called) does escape each and every backslash in the input string.

Yep @jfirebaugh is right. The actual escaping is happening in polyline_encoder.cpp:59 This works fine when you use osrm-routed because the actual JSON serializer in OSRM does not escape the strings again.

Good findings on double escaping in polyline encoding and NanNew(). Running point on getting the fix into shape.

DennisOSRM commented 9 years ago

We should be good to close here.

cc: @TheMarex