parse-community / parse-server-push-adapter

A push notification adapter for Parse Server
https://parseplatform.org
MIT License
85 stars 100 forks source link

Status 400 and reason BadDeviceToken #87

Open halavins opened 7 years ago

halavins commented 7 years ago

I've updated the parse-server to the latest 2.5.3 version and parse-server-push-adapter v 2.0.0 and now my error logs are full of these messages:

3|parse-wr | node-pre-gyp
3|parse-wr |  
3|parse-wr | ERR!
3|parse-wr |  
3|parse-wr | parse-server-push-adapter APNS
3|parse-wr |  APNS error transmitting to device 28c89bdc6a0e7a6454fb0cc446229a0841eecdea955533d3c549a658bb15fc98 with status 400 and reason BadDeviceToken

There are so many of them that I can't see anything else.

The questions are: 1) How can I ignore these errors in my pm2 config file? 2) Is there a way to remove bad device tokens automatically from Installation Collection?

flovilmart commented 7 years ago

How can I ignore these errors in my pm2 config file?

There is no particular way to ignore those, they are produced when running with VERBOSE

Is there a way to remove bad device tokens automatically from Installation Collection?

For now there isn't

halavins commented 7 years ago

Hey @flovilmart, my VERBOSE setting is disabled, but my logs are filled with these errors by 90%.

How do I turn them off? The fact that servers show and write these errors damage the performance.

flovilmart commented 7 years ago

After looking into the repo, the error is reported by the logger as an error log, you won’t be able to disable it at that time.

miguel-s commented 7 years ago

Hi @flovilmart, we are running our parse server with logLevel: 'none', but we are still getting these errors logged since we upgraded to 2.5.3, is this intended?

halavins commented 7 years ago

Looks like they just forgot to set the right setting for these errors.

I've downgraded to an earlier version.

halavins commented 6 years ago

Any news on this issue? Is there a way to switch the Push errors off?

Or is there a way to automatically erase all Installations with Bad Device Tokens?

funkenstrahlen commented 6 years ago

As far as I know in the latest version there is a way to automatically delete bad tokens.

halavins commented 6 years ago

Do you know any adequate way to find the bad device tokens and erase them?

We have a bunch of them, and they only increase in numbers, especially the ones for Androids. They never self-delete even if the app is reinstalled.

smiley-yoyo commented 6 years ago

Hi @funkenstrahlen can you please tell the way to automatically delete bad tokens? thanks a lot.

I have figured it out. After release 2.6.3, you can automatically delete bad tokens by setting the PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS environment variable = 1. I have tested it on 2.6.5 and pass.

Still thanks a lot to @funkenstrahlen for giving the hint.

halavins commented 6 years ago

@wicked-wei , our tests for this PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS option showed that: 1) even the VALID iOS deviceTokens are erased for some reason 2) the old, invalid, unused android deviceTokens are not erased at all

smiley-yoyo commented 6 years ago

@halvini thanks for your comment. I haven't seen the problem you mentioned yet but I will keep eyes on it. And, the code that PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS option works is at parse-server/src/StatusHandler.js, you can check it out for more detail.

flovilmart commented 6 years ago

@halvini can you please point out what part of the logic there is not working as expected?

halavins commented 6 years ago

@flovilmart , the issue is described in this ticket: https://github.com/parse-community/Parse-SDK-iOS-OSX/issues/1224

We are 100% sure that PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS flag cleans up even VALID ios installations. Why? I don't know. I've checked the StatusHandler, I've checked the APNS.js. I don't see anything unusual out there, but I believe that when you send many many push notifications to different installations, probably APNS returns BadDeviceToken error even for valid installations. It doesn't happen immediately. For instance, my own valid installation was unset by this flag in a week after I turned it on. So, probably, if there is a timeout, or any other connection or delivery error, the APNS thinks that device is either Unregistered, or Bad.

smiley-yoyo commented 6 years ago

@halvini you can also check that if there is an unexpected mixed using of both apple sandbox and production push on a same environment(or database). This may cause unexpected deletion of deviceToken.

halavins commented 6 years ago

@wicked-wei the sandbox was used by 20 beta-testers only.

When we set PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS flag, nothing bad happened at first, but then in 3 days we started getting complaints from both sandbox users and production users. Why were the production users affected by this flag?

smiley-yoyo commented 6 years ago

@halvini because sandbox and production use different deviceToken. If any chance you send a production deviceToken to a sandbox apns server, it will report that token is BadToken. Reverse is the same.

halavins commented 6 years ago

@wicked-wei ok, so why did production users lose their deviceToken data, if we send push notifications from cloud code only? The parse-server config has both certificates installed:

    "PARSE_SERVER_PUSH": {
      "android": {
        "senderId": "*****",
        "apiKey": "*****"
      },
      "ios": [
        {
          "pfx": "*****/developmentPush.p12",
          "topic": "*****",
          "production": false
        },
        {
          "pfx": "*****/productionPush.p12",
          "topic": "****",
          "production": true
        }
     ]
    },
smiley-yoyo commented 6 years ago

@halvini the config looks ok, but I am not sure if parse server can tell a installation deviceToken is production or sandbox. For the js documents does not mention that, maybe it will send a token to both production or sandbox? I am not sure because I never do that. Oh, I forgot that. The installation has 'appIdentifier' field to tell deviceToken belongs app. So If you config two pfx with different topic, that will be Ok.

halavins commented 6 years ago

@wicked-wei , this is all quite informative, but it doesn't resolve the issue of undefined deviceTokens for valid ios installations.

Your idea is to change the development certificate to a new bundle id, am I right? How can I be sure that that's the reason? I can't risk my production environment again. I've lost too much data when I set the PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS flag. It worked for a few days only but resulted in a huge mess.

halavins commented 6 years ago

Did anyone find a solution to this issue?

I'd like to set the PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS to 1 without losing valid iOS installations.

alexblack commented 6 years ago

Any update here? is PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS safe to use? Are there docs on it?

jeffreyjackson commented 6 years ago

im interested in this as well. just for the sake of keeping logs clean

flovilmart commented 6 years ago

@alexblack @jeffreyjackson the only way to know if it's safe to use is for you guys to use it and report your findings.

alexblack commented 6 years ago

Well, there are some findings above, sounds risky, we'll find another way.

flovilmart commented 6 years ago

we'll find another way.

That's not really helpful as the gate behind the feature flag is for the community to experiment. I'd rather revert the PR / feature if no one is willing to experiment with us.

alexblack commented 6 years ago

Fair point. I should share my other feedback - also not too excited about the feature in general, if device tokens are erased then it seems to make it harder to distinguish installations with no device token vs ones that were erased.

This combined with the lack of real errors/transparency from parse push means we'll eventually just integrate with apple and google, then we'll have real errors back, and we can just mark installations as unreachable when we learn that from the source.

Also as we've reported parse-dashboard doesn't display past pushes for us. So we're not really having a great experience here.

flovilmart commented 6 years ago

also not too excited about the feature in general, if device tokens are erased then it seems to make it harder to distinguish installations with no device token vs ones that were erased.

This is what you were expecting originally, that this feature is working correctly, not sure where you're coming from with that.

This combined with the lack of real errors/transparency from parse push means we'll eventually just integrate with apple and google, then we'll have real errors back, and we can just mark installations as unreachable when we learn that from the source.

I'm open for improving the project / provide a better feature but without community feedback this is unfortunately impossible.

Opensource doesn't mean we the maintainers have to take all the risks and responsibility.

The community pressed to have a cleanup invalid token implemented, IIRC you were part of it, I spent a shit load of hours on it, a tentative implementation has been available for a while and now you're pressing again today to have something else implemented. Do you understand how unfair, unethical and unwelcome those commentaries are?

alexblack commented 6 years ago

Sorry. Just sharing my feedback. :(

alexblack commented 6 years ago

I don't remember pressing to have device tokens erased, but, if I did, and now I realized that having them not erased would be better, I apologize for my lack of foresight.

flovilmart commented 6 years ago

Any update here? is PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS safe to use? Are there docs on it?

@alexblack that's not what I call sharing feedback, but waiting-for-someone-else-to-do-some-work-for-me.

Sharing feedback would have been to open issues, discuss potential improvements and most importantly, identify if there's any issue with the approach. What would be the best solution for you?

Now, we gather all error reports from the adapter, and list the installations that fail. Removing their token is what's expected from GCM or APNs. What's missing?

alexblack commented 6 years ago

I put in some hours trying to make a PR in the past. I failed. That seems like more than waiting-for-someone-else-to-do-some-work-for-me.

flovilmart commented 6 years ago

If you have trouble opening a PR, we, the maintainers are here to help

alexblack commented 6 years ago

Now, we gather all error reports from the adapter, and list the installations that fail. Removing their token is what's expected from GCM or APNs. What's missing?

At the risk of being unethical and unfair, while this is a huge step forward (nice work), what ultimately I am looking for is:

  1. for installations to be marked unreachable when they are unreachable, then we can avoid sending pushes to them (and to be able to distinguish different cases like installations that never had a device token etc)
  2. to be able to see for any (push, installation) pair what the result was from the source (GCM, APN). eg did it succeed, did it fail, whats the error.
alexblack commented 6 years ago

If you have trouble opening a PR, we, the maintainers are here to help

You reviewed my PR many times. I'm not saying I did a great job, but I did put in some time, and we did chat. You gave feedback, my PR ultimately wasn't great.

Yet, what i've heard from you is that I do nothing but wait for you to do the work.

flovilmart commented 6 years ago

So from the current codebase, which contains 99% of the feature, and that logs the error, does it seems unachievable to add your 2 requirements?

alexblack commented 6 years ago
  1. seems like it'd be easy for (me, or someone else) to add. eg set a boolean installation.unreachable=true instead of erasing the device token, and then exclude installations with unreachable=true from having pushes sent

  2. I think what would be useful here is to have a new collection, push_installation, with columns push_id, installation_id, result

My pr was moving in that direction, and the sense I got from you was that this was not a direction you wanted to go.

Given that the errors are logged, I guess if they include the push id, and the installation id, then I could write something in our server to collect them, and put them into our own table, but it doesn't seem like a great solution.

flovilmart commented 6 years ago

My pr was moving in that direction, and the sense I got from you was that this was not a direction you wanted to go.

Yes we need to collect the errors in a _PushStatus object and not in another collection. Our role also as maintainers is to maintain YOUR code once we put it in, so we have to weigh in and our on those.

seems like it'd be easy for (me, or someone else) to add. eg set a boolean installation.unreachable=true instead of erasing the device token.

What would be the benefit over removing the deviceToken? Imagine the device token gets updated, what do we do with the unreachable? How do we handle conflicts if s/o already uses 'unreachable'? What if the installation is reachable but wrongly flagged?

alexblack commented 6 years ago

What would be the benefit over removing the deviceToken?

To distinguish these installations from ones which never had a deviceToken. Some of our installations do not have deviceTokens, we need to make sure that doesn't happen often, and that we have a way to fix it for our users in that situation.

alexblack commented 6 years ago

Yes we need to collect the errors in a _PushStatus object and not in another collection.

Makes sense. If it was a map in there of installationId -> result that would be very useful.

alexblack commented 6 years ago

Imagine the device token gets updated, what do we do with the unreachable?

It should be erased.

How do we handle conflicts if s/o already uses 'unreachable'?

Seems like a challenge for any new field, not just this one. I'm not sure what strategy you should use for adding/naming new fields, but, presumably at some point somewhere you'll have to add a new field.

What if the installation is reachable but wrongly flagged?

Seems like this issue happens with the existing solution too. Eg what if the installation is reachable, but you erased the device token.

At least with unreachable you could choose to re-attempt if you wanted. So it seems like unreachable is a much better solution in this scenario.

flovilmart commented 6 years ago

I guess if they include the push id, and the installation id

It should be as is https://github.com/parse-community/parse-server/blob/57da2de/src/StatusHandler.js#L249

Some of our installations do not have deviceTokens, we need to make sure that doesn't happen often, and that we have a way to fix it for our users in that situation.

Not sure what's to fix there, ultimately deleting the device tokens is just a way to ensure the delivery rates are close to 1.0 in your _PushStatus. Which in the end doesn't really matter.

Seems like this issue happens with the existing solution too. Eg what if the installation is reachable, but you erased the device token.

I'm not saying the current solution is perfect, nor can't be improved. but in both cases, what you want is just tracking the errors in the _PushStatus object. THat you can use the _PushStatus object to update your installations to add an 'unreachable' flag periodically is your own prerogative. So just tracking the errors/responses into the _PushStatus would be probably OK.

alexblack commented 6 years ago

Not sure what's to fix there, ultimately deleting the device tokens is just a way to ensure the delivery rates are close to 1.0 in your _PushStatus. Which in the end doesn't really matter.

If device tokens are erased, then we lose the ability to distinguish two different failure modes:

  1. The device was reachable, its not not
  2. The device failed to get a device token.

Distinguishing these two cases is important for us.

flovilmart commented 6 years ago

This is why ultimately because we have two different goals, we need to settle on a common ground which is simply to track the invalid deviceTokens. Which is done today through a log line, and could be put into the _PushStatus object.

alexblack commented 6 years ago

So just tracking the errors/responses into the _PushStatus would be probably OK.

If errors/responses are added to _PushStatus in a map keyed by installationId, that'd be super helpful!

alexblack commented 6 years ago

This is why ultimately because we have two different goals, we need to settle on a common ground which is simply to track the invalid deviceTokens.

I'm simply responding to your question What would be the benefit over removing the deviceToken?

If helping support my goal here doesn't fit in with the project's aims or priorities, fair enough.

flovilmart commented 6 years ago

If errors/responses are added to _PushStatus in a map keyed by installationId, that'd be super helpful!

Go ahead and open a PR, with the proper tests.

alexblack commented 6 years ago

I tried before (along slightly different lines), I frankly sucked at it, so for now I'm not going to try again. Sorry :(

flovilmart commented 6 years ago

That was really a waste of my time... thanks a lot

alexblack commented 6 years ago

I'd love for us to contribute somehow.

  1. Is there a way we could contribute financially to the parse-server project on an ongoing basis, perhaps by paying for an existing (or additional) cloud service of some type?

  2. I think we can get a PR done, this would be a good idea for one of the guys on my team to help out with, instead of us integrating directly with Google/Apple.

Today a _PushStatus looks like this:

    {
      "objectId": "vT77c6QI3W",
      "pushTime": "2018-05-14T14:44:09.484Z",
      "query": "{\"user\":{\"$inQuery\":{\"where\":{\"account\":{\"__type\":\"Pointer\",\"className\":\"Account\",\"objectId\":\"Qq6tfNsUoR\"}},\"className\":\"_User\"}},\"objectId\":{\"$ne\":\"Y1jVkY4LZp\"},\"deviceType\":{\"$ne\":\"web\"},\"lastSeenAt\":{\"$gt\":{\"__type\":\"Date\",\"iso\":\"2018-04-14T14:44:09.444Z\"}}}",
      "payload": "{\"pushType\":4,\"pushVer\":1,\"fromInstallation\":\"Y1jVkY4LZp\",\"timestamp\":1526309053323,\"aps\":{\"content-available\":1},\"isId\":\"4299c240-5785-11e8-9be6-a7fdcaf3dcac\"}",
      "source": "rest",
      "status": "succeeded",
      "numSent": 0,
      "pushHash": "d41d8cd98f00b204e9800998ecf8427e",
      "createdAt": "2018-05-14T14:44:09.484Z",
      "updatedAt": "2018-05-14T14:44:09.770Z",
      "numFailed": 1,
      "failedPerType": {
        "android": 1
      },
      "ACL": {

      }
    }

Would the idea be to add a field to it like this?

    {
      "objectId": "vT77c6QI3W",
      "pushTime": "2018-05-14T14:44:09.484Z",
      "resultsByInstallationId": {
        "VvNFL12st4": { 
          succeeded: true, 
          result: {
            "message_id": "0:1526665330066787%b4783826f9fd7ecd"
          },
        "LBsWhDeAMn": { 
          succeeded: false, 
          result: {
            "error": "NotRegistered"
          }
       }        
   }
flovilmart commented 6 years ago

I don’t believe there’s the installationId back from the push adapter, only the deviceToken