Open GingerMoon opened 3 years ago
First: THANK YOU! It is too easy for us to use language shortcuts which can be confusing to anyone who does not understand them.
I will try to fix these in the document, when I can, but I hope the following helps.
Notification A message sent to an endpoint node
would be the same as saying
A Notification is a message sent to an endpoint node
This is shorthand (a quick form of style) that removes some repetition.
This definition states that "Notifications" are the messages sent to an endpoint node (also known as a "Subscription update" in the RFC), which are stored in the messages database table.
It may help to understand that Autopush is actually two different programs. There is the "connection handler" (which, unfortunately, is called autopush, and that does not help remove confusion) that maintains the long-lived WebSocket connection between the browser (also called the user agent) and the Autopush service, and the "endpoint handler" (called autoendpoint) which accepts messages from third-party webpush providers and routes them to the appropriate service.
If it's any help whatsoever, please consider how we speak of this service internally:
autopush
autoendpoint
Unfortunately, due to legacy reasons, we can't change the name of the project.
(Skipping a few of the other questions because they may be related)
Is the primary key of the router table <uaid(partition key),node_id(sort key)> ? (Sorry to ask, where can I find the schema in the source?)
DynamoDB is a key/value database (sometimes called NoSQL) and does not have a traditional SQL schema. Instead, you define a Primary Key and a Secondary Hash (DynamoDB's own names for these are the partition key and the sort key) and use those to refer to arbitrary fields. Think of it like having data in a barrel you store in a warehouse. You know the aisle and shelf a barrel is on, but finding it by anything else requires you to keep another index. Fortunately, you can store whatever you like in that barrel.
This means having a schema doesn't really make sense because you would have many, many optional fields.
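To make the barrel picture concrete, here is a minimal sketch in Python. It is not Autopush code and not the DynamoDB API; it is an in-memory stand-in, and every function name, key value, and field name in it is hypothetical. It only shows that an item is found by its two keys, while the fields stored under those keys can differ from item to item.

```python
from typing import Any, Optional

# Hypothetical in-memory stand-in for a DynamoDB-style table.
# An item is addressed by (partition key, sort key); its fields are free-form.
Table = dict[tuple[str, str], dict[str, Any]]


def put_item(table: Table, partition_key: str, sort_key: str, **fields: Any) -> None:
    """Store arbitrary fields in the 'barrel' at (partition_key, sort_key)."""
    table[(partition_key, sort_key)] = dict(fields)


def get_item(table: Table, partition_key: str, sort_key: str) -> Optional[dict[str, Any]]:
    """Fetch one barrel by its exact aisle and shelf, or None if it is absent."""
    return table.get((partition_key, sort_key))


def query(table: Table, partition_key: str) -> dict[str, dict[str, Any]]:
    """Return every item sharing a partition key, indexed by sort key."""
    return {sk: item for (pk, sk), item in table.items() if pk == partition_key}


# Illustrative data only: the two items under "uaid-1" carry different fields.
messages: Table = {}
put_item(messages, "uaid-1", "chan-a:001", data="hello", ttl=60)
put_item(messages, "uaid-1", "chan-b:002", headers={"encoding": "aes128gcm"})
put_item(messages, "uaid-2", "chan-c:003", data="other")
```

Looking anything up by a field other than the two keys (for example, "all items with a ttl below some value") would mean scanning every barrel, which is why a separate index is needed for that kind of question.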
As much as I don't like to say it, your best chance to understand what's going on is to look at the code.
Table rotation was originally done because DynamoDB did not yet offer a TTL (Time To Live) option, which would allow data to be automatically deleted once it had expired. Table rotation was complicated, annoying, and broke often. Table rotation is not used anymore, and we are very happy about that.
"chan list" was shorthand for the "Channel List". This is the list of channels associated with the User Agent. The Channel List was one of those many records stored in the data barrel (see #3 and #4 above). I'll also say that we were being "clever" by storing this in a DynamoDB record that had a Primary Key (the User Agent ID, or UAID), but no Secondary Hash (well, kind of. We used a single space character). This was where we stored information about the user in general. I'll apologize, because "clever" often leads to "confusing", and that's undoubtedly the case here.
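As a rough illustration of that "clever" record, the sketch below keeps the Channel List in the same table as everything else for a UAID, under a sort key that is a single space. This is a hypothetical in-memory model, not the actual Autopush code: the `chids` field name and the helper names are assumptions made for the example.

```python
from typing import Any

# Single-space sort key, as described above, marks the per-user record.
CHANNEL_LIST_SORT_KEY = " "

# Hypothetical in-memory table keyed by (uaid, sort key).
Table = dict[tuple[str, str], dict[str, Any]]


def register_channel(table: Table, uaid: str, chid: str) -> None:
    """Add a channel ID to the user's Channel List record, creating it if needed."""
    record = table.setdefault((uaid, CHANNEL_LIST_SORT_KEY), {"chids": set()})
    record["chids"].add(chid)


def channel_list(table: Table, uaid: str) -> set[str]:
    """Return a copy of the user's Channel List (empty if none is stored)."""
    record = table.get((uaid, CHANNEL_LIST_SORT_KEY))
    return set(record["chids"]) if record else set()


table: Table = {}
register_channel(table, "uaid-1", "chan-a")
register_channel(table, "uaid-1", "chan-b")
```

The point of the trick is only that one well-known key pair per UAID holds the general information about that user, alongside the per-message items stored under other sort keys.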
Connection nodes.
Those are the autopush connection handlers. Each connection handler can only accept some number of connections. We have to run many of them to handle all the users of our system. When a user agent connects to a connection node, the connection node "registers" that user into the routing table. When an incoming message for a given UAID is accepted by the endpoint handler, the endpoint handler looks up how to route the message to the correct connection node by looking into the routing table.
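The register-then-look-up flow described here can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the real implementation: the routing table is a plain dict, and the names `register`, `lookup_node`, and the node IDs are hypothetical.

```python
from typing import Optional

# Hypothetical in-memory routing table: UAID -> ID of the connection node
# currently holding that user agent's connection.
RoutingTable = dict[str, str]


def register(routing: RoutingTable, uaid: str, node_id: str) -> None:
    """Called by a connection node when a user agent connects to it."""
    routing[uaid] = node_id


def lookup_node(routing: RoutingTable, uaid: str) -> Optional[str]:
    """Called by the endpoint handler to find where to route a message.

    Returns None when no connection node is registered for the UAID.
    """
    return routing.get(uaid)


routing: RoutingTable = {}
register(routing, "uaid-1", "node-a")   # user agent connects to node-a
register(routing, "uaid-2", "node-b")
register(routing, "uaid-1", "node-c")   # same user reconnects elsewhere
```

In this model a reconnect simply overwrites the earlier entry, so the endpoint handler always sees the most recently registered node for a UAID.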
Again, thank you for these questions. I will try to make the documents a bit clearer. The new Autopush-rs server works in the same way as this server and is what we currently use in production for connection handling. We're hoping to get the second part (the endpoint handler) running soon, but we need to test it out.
Thanks @jrconlin!
Going to reopen this, because I still need to make the documentation better. 😉
I tried hard and spent a lot of time on understanding the architecture doc, but still failed to understand it. I would appreciate it if anyone could help me and answer my questions about the document (I am sorry if the answers to these questions are quite obvious):
In Glossary
What does this sentence mean exactly?
"Autopush stores these in the message tables." According to doc,
So it's "autoendpoint (endpoint node)", NOT "autopush (connection node)", which stores these notifications in the message tables.
2. Router Table Schema
Does "client" here mean the same as "user agent"?
Considering the concept below: autopush (connection node) autoendpoint (endpoint node)
So "Autopush" is confusing here. It's autoendpoint instead of autopush that clears the node_id record, right?
Is the primary key of the router table <uaid(partition key),node_id(sort key)> ? (Sorry to ask, where can I find the schema in the source?)
What does "secondary global index" here mean? How does "secondary global index" allow for maintenance scripts to locate and purge stale client records and messages?
Is it saying "If table rotation is NOT disabled"? How is "Message Table Rotation" implemented?
6. In Rules for Endpoints
What does it mean? Where is the chan list coming from?
Don't quite understand "Rules for Connection Nodes"....