Closed sprabs closed 8 years ago
@flovilmart Is this something you've come across/do you have any suggestions? You were super helpful last time! :)
This seems related, too, but I don't see a resolution: https://github.com/ParsePlatform/parse-server/issues/1591
The _PushStatus collection should not cause any issue anymore as it should not be part of the schema. This seems to be a bug.
Working with latest version 2.2.16 - the _PushStatus is created in the Schema and break the server on parse.com side.
@flovilmart I checked the commit you added to fix the issue in April. I noticed that in the latest code, part of that was lost. I have done a point fix on my side and am still testing. It's just a single line in src/Controllers/DatabaseController.js from: https://github.com/ParsePlatform/parse-server/pull/1636
Would incorporating that change back in solve the _PushStatus problem or does more need to be done?
@sprabs I'm not sure what you're referring to. If you want to open a pull request, go ahead.
@flovilmart This is what I mean:
You changed this code https://github.com/ParsePlatform/parse-server/pull/1636 to fix the _PushStatus issue back on April 25th, correct?
If you look at the latest code on master, all your changes have persisted except the one you made in src/Controllers/DatabaseController.js. (Why did that get lost in the first place by the way?) In my local repo, I added that change back to see if that fixes the issue. Do you know if the loss of that change is why the issue has returned? (By that I mean, the fact that that one line change seems to have been blown away?)
The fix for the _PushStatus class was to make it so it's not stored in the _SCHEMA. Maybe that fix was accidentally reverted.
How can we verify this?
check https://github.com/ParsePlatform/parse-server/blame/master/src/Controllers/DatabaseController.js also, I'm still not sure that the change was reverted.
Look at line 98 in the current /src/Controllers/DatabaseController.js that's checked in to master.
previous line: .then(() => schemaController.getOneSchema(className, true))
was replaced with (current code): .then(schemaController => schemaController.getOneSchema(className))
When I reverted this change to the version that is claimed to have worked, I don't seem to be seeing the _PushStatus schema issue anymore. Are there any side effects for this change? Do we know why someone changed it in the first place? Was it an oversight or by design?
Given the blame info, @drew-gross may be able to give more info.
@flovilmart Looks like even with my code change, the issue happened again. There's definitely a bug with src/Controllers/DatabaseController.js
To reproduce this issue:
Send several push notifications (we've sometimes triggered it by just sending 5!) from an iOS or Android client that is pointing to a parse-server endpoint (in our case we have it hosted on an AWS Elastic Beanstalk, but that should not matter). It also doesn't matter who the recipient of the push notification is (it could be another client pointing to parse.com or to parse server).
The _PushStatus collection's schema then gets added to the mLab _SCHEMA collection. At that point, clients that are pointing to parse.com can no longer trigger push notifications (in our case, through a Cloud Code method). However, they can receive push notifications from clients that are pointing to the parse server endpoint.
The deadline is coming, and this is a pretty serious issue for the period of time when we are migrating our clients off of parse.com... Any suggestions for how to fix this? @drew-gross @hramos
Does anyone have updates or suggestions for how to proceed? This affects everyone migrating IMO.
@drew-gross @MBDeveloper @flovilmart @hramos
@sprabs is this bug only affecting dashboard.parse.com?
@flovilmart No, it's breaking any push notifications sent through clients still point to parse.com as well. That's the main issue for us. It's caused by _PushStatus schema being created. Here's the client error: [com.parse.ParseRequest$ParseRequestException: Invalid classname: PushStatus, classnames can only have alphanumeric characters and , and must start with an alpha character while sending push notification.]
So basically, we can't migrate until it's fixed because it will take a few weeks for all clients to migrate. We can't have the clients on parse-server breaking push notifications for clients on parse.com every 10-15 minutes...
You can use the parse-dashboard in the meantime no? If this is a really pressing issue for you I encourage you to try to put a PR together and work on a fix that would benefit everyone.
The main issue isn't just the dashboard (as stated above). It's that it breaks push notifications for clients still using parse.com. We extensively tested this. I'll re-post the findings:
Basically, once a set of push notifications are triggered from any client or group of clients pointing to parse-server, at some point the _PushStatus schema gets added. At that point, two things happen: parse.com dashboard breaks AND any clients still using parse.com will error when attempting to send a push notification (in our case through a cloud method).
@flovilmart :/ I've spent time looking at it, but am not sure where to start. Unfortunately, I'm not familiar enough with the parse-server code to add a fix. I was hoping you could help me or at least point me in the right direction since you've worked on this part of the code base before.
@sprabs I can't guarantee you that this particular fix will be done in the next days... I have other priorities of my own.
@flovilmart I understand. Could you point me to where in the code to start? I could use a hint.
I'm not sure either why this would be stored in the _Schema
I just added another part of a test on the current master branch, and it don't seem _PushStatus gets saved part of the _Schema... https://github.com/ParsePlatform/parse-server/pull/2300
also, can you reproduce the error on a test server while running with VERBOSE=1 ? I'm curious if some _PushStatus object isn't leaked back from your cloud code, back to the client.
We have VERBOSE=1 and I am checking the logs. I've just added your updated test, too.
One thing we did notice that it's very easily triggered when we do 5 or more quick actions from a single client in succession (from a device that's pointing to our own hosted server). I don't know if that's a hint.
I am seeing a few interesting logs (based on console statements I added in the code) around the time when the failure happens:
[36mverbose[39m: sent push! 0 success, 0 failures getOneSchema called for _Installation with allowVolatileClasses = false getOneSchema called for _PushStatus with allowVolatileClasses = false getOneSchema called for _PushStatus with allowVolatileClasses = false [32minfo[39m: warning: error while sending push code=101, message=Object not found. getOneSchema called for _Installation with allowVolatileClasses = true
Is there ever a situation where allowVolatileClasses should be true?
I actually see this, too: getOneSchema called for _PushStatus with allowVolatileClasses = true
I think this is what does it.
I believe you point in the right direction.
Should there ever be a situation when allowVolatileClasses = true?
I see that this is set in line 481 of DatabaseController.js
the problem is not that, the problem is that we let writes happening on the _SCHEMA with a volatile classes.
That makes sense. This time around, it got triggered again. I added a stack trace and more logs to help debug further. Here's what I see...
Error (I forcibly created a stack trace): create called within DatabaseController at DatabaseController.create (/var/app/current/node_modules/parse-server/lib/Controllers/DatabaseController.js:584:11) at Object.setInitial (/var/app/current/node_modules/parse-server/lib/pushStatusHandler.js:57:21) at /var/app/current/node_modules/parse-server/lib/Controllers/PushController.js:121:27 at run (/var/app/current/node_modules/core-js/modules/es6.promise.js:89:22) at /var/app/current/node_modules/core-js/modules/es6.promise.js:102:28 at flush (/var/app/current/node_modules/core-js/modules/_microtask.js:18:9) at nextTickCallbackWith0Args (node.js:420:9) at process._tickDomainCallback (node.js:390:13) upsertSchema was called for: _Installation getOneSchema called for _PushStatus with allowVolatileClasses = true
I see this (line 58 in parse-server/lib/pushStatusHandler.js) which looks suspicious to me:
return database.create(PUSH_STATUS_COLLECTION, object).then(function () {
58 pushStatus = {
59 objectId: objectId
60 };
61 return Promise.resolve(pushStatus);
62 });
getOneSchema don't write to the DB...
I added a stack trace to see when create is called within DatabaseController because I suspected that was the issue. And that is the issue. Just need to prevent that from happening.
you mean prevent return database.create(PUSH_STATUS_COLLECTION, object).then(function () {
??
Okay, so... I added one fix in the getOneSchema method (I've bolded it) in SchemaController.js that seems to have solved it, but I'm not sure if it could cause other side effects:
getOneSchema(className, allowVolatileClasses = false) { if ((allowVolatileClasses && volatileClasses.indexOf(className) > -1) || className == '_PushStatus') { return Promise.resolve(this.data[className]); } return this._dbAdapter.getClass(className) .then(injectDefaultSchema) }
I've basically gone through the code and it never seems that a default schema should be injected for the _PushStatus collection, so I thought this would guarantee that.
Even with this change, though, I do see the following error in the logs: warning: error while sending push code=101, message=Object not found.
Usually, the schema creation for _PushStatus was triggered after such an error. I don't know if that root problem still needs to be corrected or if that's a transient error.
the point of the allowVolatileClasses
and volatileClasses
array is to prevent them to be saved.
I'm not sure why getOneSchema
would write a schema to the database though. The problem has to be somewhere else.
@sprabs instead of trying to hack the code together, I believe you should try to write a test that would make this fail. That would greatly help instead of shooting in the dark.
@sprabs See that test here: https://github.com/ParsePlatform/parse-server/pull/2300/commits/65924a2baaf75ab1f6d090c726acdf5ab35a73e2
It doesn't seem that getOneSchema has side effects called or not with allowVolatileClasses
Hmmm... on our side, we are certain that that fixed the creation of the schema to our mLab database.
I am, however, seeing that there is still this periodic error: warning: error while sending push code=101, message=Object not found.
We are still trying to understand why that happens. The push still seems to send to the end user (so we are not sure if a retry is perhaps getting triggered on the server side). There's no obvious problem that is noticeable on the client side as the pushes are being received as expected.
Is it possible that the schema gets created on a push failure?
Again, this is too hackish to make it into a PR IMHO, as there is no way to test that this behaviour will last.
Also, https://github.com/ParsePlatform/parse-server/blob/master/spec/helper.js#L160
After each test we ensure that no class gets added to the _SCHEMA object in the DB,
// Other system classes will break Parse.com, so make sure that we don't save anything to _SCHEMA that will
// break it.
I'm not sure why in your particular case, this _PushStatus added to the _SCHEMA, as all tests, and behaviours lead to another conclusion.
What I suggest is that you try to make a test case that will break this behaviour once and for all.
Our issue, though temporarily fixed through the hack, seems to correlate with this error: warning: error while sending push code=101, message=Object not found.
Any idea what triggers this? It's coming from the pushStatusHandler.js
This error may be triggered because your hack is breaking some functionalities...
I can't know for sure why that would break, did you run the unit tests with your hack in there?
I'm gonna close that issue as it's unlikely we're gonna find a solution at that point, feel free to open a PR when you find the root cause of your problem as you can see, we have many tests covering your issue and we are not able to reproduce it.
@flovilmart That error is not caused by the hack. That error was there from day 1 using the existing code available on master.
@sprabs Like I suggested many times, can you please write a test that exemplifies the problem, and open a pull request with it? I've tried, with the PR that I opened to do the same without success. For now, you only provide excerpts from logs, no full logs, no steps to reproduce, and no guarantee to reproduce the bug(s) that you have.
You mentioned a problem on the _SCHEMA and now code=101, message=Object not found.
That doesn't help in any way.
I am personally not affected by that issue, and our setup is serving more that 200 requests per seconds and delivering push many times a day.
@flovilmart I still haven't quite figured out how to create the test for this, but I think we noticed one hint today.
We followed the stack and added a bunch of logs in the MongoStorageAdapter.js for findOneAndUpdate, upsertOneObject, updateObjectsByQuery after figuring out where the failure was happening. Through those logs as well as others we have added around the push notifications flow, we saw that a push notification is initially logged as "succeeded" and then right after that, we see that one with the same objectId goes to a "pending" state.
After that point, we see the error: warning: error while sending push code=101, message=Object not found.
We are suspecting that this schema creation is happening in MongoCollection.js (possibly through _adaptiveCollection) after it hits the error. Do you have any ideas how to best create a test to prove this?
When we hit that error, the dashboard issue happens (in which case parse.com is no longer accessible for that app and iOS/Android clients pointing to parse.com can no longer send push notifications).
It appears to be a timing issue for push notifications with respect to status update. Here are some logs that consistently happen for errors. As you can see below, in the failure case, the status update for the push notification is happening in the wrong sequence (running -> succeeded and then pending -> running). In the normal case, it is happening from pending -> running and then running -> succeeded. Are you familiar with this issue? What is your recommendation for implementing a solution?
FAILURE CASE:
findOne className[_PushStatus] schema[{"fields":{}}] query[{"status":"running","objectId":"YvioEvH1kQ"}] update[{"status":"succeeded","updatedAt":"2016-07-21T01:45:17.400Z","numSent":0,"numFailed":0}] findOne className[_PushStatus] schema[{"fields":{}}] query[{"status":"pending","objectId":"YvioEvH1kQ"}] update[{"status":"running","updatedAt":"2016-07-21T01:45:17.400Z"}] findOne className[_PushStatus] schema[{"fields":{}}] query[{"objectId":"YvioEvH1kQ"}] update[{"errorMessage":"{\"code\":101,\"message\":\"Object not found.\"}","status":"failed","updatedAt":"2016-07-21T01:45:17.454Z"}]
NORMAL CASE:
findOne className[_PushStatus] schema[{"fields":{}}] query[{"status":"pending","objectId":"jgZ2f047IX"}] update[{"status":"running","updatedAt":"2016-07-21T01:45:17.794Z"}] findOne className[_PushStatus] schema[{"fields":{}}] query[{"status":"running","objectId":"jgZ2f047IX"}] update[{"status":"succeeded","updatedAt":"2016-07-21T01:45:17.883Z","numSent":3,"numFailed":0,"sentPerType":{"ios":1,"android":2}}]
@sprabs @flovilmart , I'm not sure if this would give a clue as to the cause of this issue, but this code I'm using to send Push notifications always causes _PushStatus to be written in _SCHEMA:
function pushToUsernames(usernames, data) {
var userQuery = new Parse.Query(Parse.User);
userQuery.containedIn("username",usernames);
userQuery.find({useMasterKey:true}).then(function(users) {
console.log("Found " + users.length + " users");
var pushQuery = new Parse.Query(Parse.Installation);
pushQuery.containedIn("username", usernames);
Parse.Push.send({
where: pushQuery,
data: data,
}, {
success: function() {
console.log("pushToUsernames: Push sent successfully to " + usernames );
},
error: function(error) {
console.log("Failed to send push : " + error.message);
},
useMasterKey: true
});
});
}
@sprabs are you able to test with that patch: https://github.com/ParsePlatform/parse-server/pull/2367 ?
I have the latest version of the parse-server code, running on AWS Elastic Beanstalk with an mLab database:
I am hitting some issues with push notifications between my clients. All the credentials were set up fine for iOS and Android and it was working great. Then, after some time, when I perform some client-to-client operations (for testing) in which one client sends push notifications to another client or group of clients through a Cloud Code method, I start to see some problems.
I get this error in Android Studio: failed with error: Invalid classname: PushStatus, classnames can only have alphanumeric characters and , and must start with an alpha character while sending push notification.
Subsequently, when I send any additional push notification through an Android device, it errors out (same error message). Also, when I then try to load the data browser (hosted on parse.com), I see that it fails.
After some debugging, the issue is "resolved" (until it happens the next time) when I remove the _PushStatus document in the _SCHEMA collection. This has happened three times now in the last 24 hours...and is obviously not something that would fly in production.
Is anyone else hitting this problem? How did you fix it? I think the problem is that _PushStatus is not a supported collection that should belong in the mLab schema.