Closed vladikoff closed 3 years ago
If this was desktop I'd say "you've missing the /1.0/sync/1.5
suffix on your tokenserver URL" but I don't recall whether that's needed with the rust components.
If this was desktop I'd say "you've missing the /1.0/sync/1.5 suffix on your tokenserver URL" but I don't recall whether that's needed with the rust components.
I tried it with https://token.services.mozilla.com/
without the versioning business and it synced properly!
Omitting the custom sync server results in the same error:
sync15::token D Fetching token from https://token.stage.mozaws.net/1.0/sync/1.5
ResolverController E No valid NAT64 prefix (123, <unspecified>/0)
SELinux E avc: denied { find } for interface=vendor.qti.hardware.servicetracker::IServicetracker sid=u:r:system_server:s0 pid=1298 scontext=u:r:system_server:s0 tcontext=u:object_r:default_android_hwservice:s0
tclass=hwservice_manager permissive=0
E avc: denied { find } for interface=vendor.qti.hardware.servicetracker::IServicetracker sid=u:r:system_server:s0 pid=1298 scontext=u:r:system_server:s0 tcontext=u:object_r:default_android_hwservice:s0
tclass=hwservice_manager permissive=0
Icing I Indexing com.android.chrome-internal.3p:WebPage from com.android.chrome using seqno 93612
SELinux E avc: denied { find } for interface=vendor.qti.hardware.servicetracker::IServicetracker sid=u:r:system_server:s0 pid=1298 scontext=u:r:system_server:s0 tcontext=u:object_r:default_android_hwservice:s0
tclass=hwservice_manager permissive=0
ResolverController E No valid NAT64 prefix (123, <unspecified>/0)
SELinux E avc: denied { find } for interface=vendor.qti.hardware.servicetracker::IServicetracker sid=u:r:system_server:s0 pid=1298 scontext=u:r:system_server:s0 tcontext=u:object_r:default_android_hwservice:s0
tclass=hwservice_manager permissive=0
chatty I uid=1000(system) hwservicemanage identical 1 line
SELinux E avc: denied { find } for interface=vendor.qti.hardware.servicetracker::IServicetracker sid=u:r:system_server:s0 pid=1298 scontext=u:r:system_server:s0 tcontext=u:object_r:default_android_hwservice:s0
tclass=hwservice_manager permissive=0
sync15::sync_multiple D Declined engines list after state machine set to: Some([])
W sync failed: Missing server timestamp header in request, final status=OtherError
W Backtrace: None
sync_manager::manager I Sync finished with status OtherError
SyncWorker E 'Other' error :(
Does it work if you use the prod FxA server with the stage tokenserver?
No, doesn't work. End up in the circuit breaker.
sync15::token D Fetching token from https://token.stage.mozaws.net/1
.0/sync/1.5
ResolverController E No valid NAT64 prefix (123, <unspecified>/0)
sync15::token W Non-success status when fetching token: 401
sync15::sync_multiple D Declined engines list after state machine set to: So
me([])
W sync failed: HTTP status 401 when requesting a token
from the tokenserver, final status=AuthenticationEr
ror
So this sync failed: Missing server timestamp header in request
is very strange, and seems to indicate that you're getting a 200 OK
from tokenserver that somehow doesn't include the X-Timestamp
header. This should in theory be impossible (which is one of the reasons we error on it) because there's a "tween" in the tokenserver server code that sets this header on every response. So, I'm not entirely sure what's happening here.
Note that it may not be the tokenserver - sync15's client.rs has the same check as token.rs. The default logs should be tweaked because they aren't giving enough info, but, I see these logs in this order:
07-21 15:19:03.098 27623 27674 I sync15::sync_multiple: Advancing state machine to ready (full)
07-21 15:19:03.098 27623 27674 D sync15::token: Fetching token from https://token.stage.mozaws.net/1.0/sync/1.5
07-21 15:19:04.002 27623 27674 D sync15::sync_multiple: Declined engines list after state machine set to: Some([])
07-21 15:19:04.003 27623 27674 W sync15::sync_multiple: sync failed: Missing server timestamp header in request, final status=OtherError
If it was the token server responding without that header, it's not clear why we'd still log about 'Declined engines' in between.
(I'm not saying that it's not the token server - just that it might not be :)
The sync continues to fail on stage using Fenix and iOS. I’ve tried the following values for Custom Sync server and the result is the same:
Empty Custom Sync server field - logs: logcat - 1.txt
This issue is a real blocker for sync stage testing from a QA perspective.
sorry, I was looking at the oauth token fetch on one and sync token on the other. Both have the same headers, including the case.
Empty Custom Sync server field - logs: logcat - 1.txt https://token.stage.mozaws.net - logs: logcat - 2.txt
Both of these are showing the sync failed: Missing server timestamp header in request
. I tried to reproduce this using my own account on stage, but wasn't able to yet.
https://token.services.mozilla.com/ - logs: logcat - 3.txt
This log shows Fenix triggering its "auth retry loop circuit-breaker" and surfacing an error to the user, which I believe is expected when using the production sync stack with a stage FxA stack.
I was able to reproduce this issue using the vlad2@restmail.net
account; luckily I've seen enough of vlad's demos that I was able to guess the corresponding account password ;-)
With additional debugging logs:
2020-08-04 11:36:32.109 12267-12457/org.mozilla.fenix.debug D/sync15::token: TS HEADERS Headers { headers: [Header { name: HeaderName("content-length"), value: "752" }, Header { name: HeaderName("access-control-allow-methods"), value: "OPTIONS" }, Header { name: HeaderName("x-timestamp"), value: "1596504992" }, Header { name: HeaderName("access-control-allow-credentials"), value: "true" }, Header { name: HeaderName("access-control-max-age"), value: "1728000" }, Header { name: HeaderName("connection"), value: "keep-alive" }, Header { name: HeaderName("x-content-type-options"), value: "nosniff" }, Header { name: HeaderName("content-type"), value: "application/json" }, Header { name: HeaderName("date"), value: "04 Aug 2020 01:36:32 GMT" }, Header { name: HeaderName("access-control-allow-headers"), value: "X-Conditions-Accepted" }, Header { name: HeaderName("strict-transport-security"), value: "max-age=31536000" }] }
2020-08-04 11:36:32.110 12267-12457/org.mozilla.fenix.debug W/lla.fenix.debu: Accessing hidden method Lsun/misc/Unsafe;->putInt(Ljava/lang/Object;JI)V (greylist, linking, allowed)
2020-08-04 11:36:32.110 12267-12457/org.mozilla.fenix.debug W/lla.fenix.debu: Accessing hidden method Lsun/misc/Unsafe;->putObject(Ljava/lang/Object;JLjava/lang/Object;)V (greylist, linking, allowed)
2020-08-04 11:36:32.777 12267-12457/org.mozilla.fenix.debug W/lla.fenix.debu: Accessing hidden method Lsun/misc/Unsafe;->getObject(Ljava/lang/Object;J)Ljava/lang/Object; (greylist, linking, allowed)
2020-08-04 11:36:32.777 12267-12457/org.mozilla.fenix.debug W/lla.fenix.debu: Accessing hidden method Lsun/misc/Unsafe;->getObject(Ljava/lang/Object;J)Ljava/lang/Object; (greylist, linking, allowed)
2020-08-04 11:36:32.778 12267-12457/org.mozilla.fenix.debug D/sync15::client: SYNC HEADERS Headers { headers: [Header { name: HeaderName("x-firefox-spdy"), value: "h2" }, Header { name: HeaderName("access-control-allow-headers"), value: "X-Conditions-Accepted" }, Header { name: HeaderName("date"), value: "04 Aug 2020 01:36:33 GMT" }, Header { name: HeaderName("alt-svc"), value: "clear" }, Header { name: HeaderName("server"), value: "openresty/1.15.8.2" }, Header { name: HeaderName("via"), value: "1.1 google" }, Header { name: HeaderName("x-weave-timestamp"), value: "1596504993.16" }, Header { name: HeaderName("content-type"), value: "application/json" }, Header { name: HeaderName("access-control-allow-methods"), value: "OPTIONS" }, Header { name: HeaderName("access-control-max-age"), value: "1728000" }, Header { name: HeaderName("access-control-allow-credentials"), value: "true" }, Header { name: HeaderName("content-length"), value: "185" }] }
2020-08-04 11:36:32.778 12267-12457/org.mozilla.fenix.debug D/sync15::sync_multiple: Declined engines list after state machine set to: Some([])
2020-08-04 11:36:32.778 12267-12457/org.mozilla.fenix.debug W/sync15::sync_multiple: sync failed: Missing server timestamp header in request, final status=OtherError
So as @mhammond suggested, the error here actually appears to be in handling the response from the sync server. In this case it's looking for the x-last-modified
header which is not present in the request.
I think this is a bug with the durable sync storage node not returning X-Last-Modified
on some responses. It's not clear to me why we see this in stage but not production.
I was able to reproduce this issue using the vlad2@restmail.net account; luckily I've seen enough of vlad's demos that I was able to guess the corresponding account password ;-)
😄
I added some additional logging to find the failing request, and:
2020-08-04 11:54:01.932 15765-15943/org.mozilla.fenix.debug D/sync15::client: SYNC HEADERS for "/1.5/6384015/info/configuration" = Headers { headers: [Header { name: HeaderName("access-control-allow-headers"), value: "X-Conditions-Accepted" }, Header { name: HeaderName("alt-svc"), value: "clear" }, Header { name: HeaderName("x-weave-timestamp"), value: "1596506042.42" }, Header { name: HeaderName("x-firefox-spdy"), value: "h2" }, Header { name: HeaderName("access-control-allow-methods"), value: "OPTIONS" }, Header { name: HeaderName("server"), value: "openresty/1.15.8.2" }, Header { name: HeaderName("access-control-max-age"), value: "1728000" }, Header { name: HeaderName("content-length"), value: "185" }, Header { name: HeaderName("via"), value: "1.1 google" }, Header { name: HeaderName("date"), value: "04 Aug 2020 01:54:02 GMT" }, Header { name: HeaderName("access-control-allow-credentials"), value: "true" }, Header { name: HeaderName("content-type"), value: "application/json" }] }
I think it's totally reasonable for the /info/configuration
request to not return an X-Last-Modified
header, TBH. Let me see what other servers do in this case...
Here's the troublesome request, captured on desktop to get more info:
XHRGEThttps://stage.sync.nonprod.cloudops.mozgcp.net/1.5/6384015/info/configuration
[HTTP/2 200 OK 184ms]
GET
https://stage.sync.nonprod.cloudops.mozgcp.net/1.5/6384015/info/configuration
Status200
OK
VersionHTTP/2
Transferred685 B (185 B size)
access-control-allow-credentials
true
access-control-allow-headers
DNT,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization,X-Conditions-Accepted
access-control-allow-methods
GET, POST, OPTIONS
access-control-max-age
1728000
alt-svc
clear
content-length
185
content-type
application/json
date
Tue, 04 Aug 2020 02:01:31 GMT
server
openresty/1.15.8.2
via
1.1 google
X-Firefox-Spdy
h2
x-weave-timestamp
1596506491.81
For comparison, here's the request I see on my production sync account, which does include the X-Last-Modified
header:
XHRGEThttps://sync-1-us-west1-g.sync.services.mozilla.com/1.5/137697556/info/configuration
[HTTP/2 200 OK 201ms]
GET
https://sync-1-us-west1-g.sync.services.mozilla.com/1.5/137697556/info/configuration
Status200
OK
VersionHTTP/2
Transferred697 B (165 B size)
access-control-allow-credentials
true
access-control-allow-headers
DNT,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization,X-Conditions-Accepted
access-control-allow-methods
GET, POST, OPTIONS
access-control-max-age
1728000
alt-svc
clear
content-length
165
content-type
application/json
date
Tue, 04 Aug 2020 01:58:31 GMT
server
openresty/1.15.8.2
via
1.1 google
X-Firefox-Spdy
h2
x-last-modified
1596506183.96
x-weave-timestamp
1596506311.77
I also tried using a new account in production, thinking this might cause an error of Fenix is the first client to sync to a new store. But in that case the production service returns X-Last-Modified: 0.00
.
Stage sync storage claims to be running syncstorage-rs version 0.5.3, while prod claims to be running version 0.4.2. I can only assume this difference in behaviour is a relatively new change in the durable sync server code.
I filed https://github.com/mozilla-services/syncstorage-rs/issues/759 to fix this on the sync server.
I think we should also consider changing the client code here to avoid erroring out if the /info/configuration
response is missing X-Last-Modified
, because that resource doesn't really have a sensible "last modified" timestamp the way that user-uploaded data does. But given that we've got shipping clients that have the current behaviour, a server-side fix seems worthwhile.
Hi @rfk! I see that bug #759 closed, but still the issue on Android/iOS happening, should it be fixed right away or we need a a-s update? just wondering about the timing to have this fixed, thanks!
I think we need to fix to be deployed to https://stage.sync.nonprod.cloudops.mozgcp.net/, I'll ask over in the bug.
(This should not require any new client-side release in order to be fixed, we just need to get the right bits on the server)
The fix for mozilla-services/syncstorage-rs#759 is now deployed on both stage and production.
With this configuration in fenix:
<img src=https://user-images.githubusercontent.com/128755/87977298-872fff00-ca9c-11ea-82bc-222b3b70599f.png width=350 />
ADB Logcat says:
┆Issue is synchronized with this Jira Task ┆Epic: a-s consumer issues ┆Sprint: Backlog