Closed: msbrogli closed this issue 4 years ago
It's not completely clear to me what the flow of information is for the most common use cases, and especially what needs to be computed on the fly versus what is just a matter of fetching from the db. For instance, I imagine the wallet balance will be calculated by fetching from the db all addresses belonging to a wallet and summing the balance of each, for each token. Is that easy to do on a NoSQL database?
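In SQL this "sum per token over a wallet's addresses" is a single aggregation. A minimal sketch with sqlite3, assuming hypothetical table/column names (address, wallet_id, token_id, balance) modeled on the Address table discussed in this thread:

```python
import sqlite3

# Hypothetical schema names based on the Address table in this design.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address (address TEXT, wallet_id TEXT, token_id TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO address VALUES (?, ?, ?, ?)",
    [
        ("addr1", "w1", "00", 22),
        ("addr2", "w1", "00", 10),
        ("addr1", "w1", "tk1", 5),
        ("addr9", "w2", "00", 99),  # another wallet; must not leak into w1's balance
    ],
)

def wallet_balance(conn, wallet_id):
    """Sum each token's balance over all addresses of the wallet."""
    rows = conn.execute(
        "SELECT token_id, SUM(balance) FROM address "
        "WHERE wallet_id = ? GROUP BY token_id",
        (wallet_id,),
    )
    return dict(rows)

print(wallet_balance(conn, "w1"))  # {'00': 32, 'tk1': 5}
```

On a NoSQL store without server-side aggregation, the equivalent would be fetching all of the wallet's items and summing client-side, which is part of the concern raised here.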
Two other things we debated in meetings that are not covered here:
One other thing that's not clear is the wallet initialization process, when a user starts a wallet for the first time (or if we removed this wallet's data from our service). Do we need a process to go through all txs and build the info for this wallet?
I think the design is good so far, but the actions should have more details; e.g. I believe all the APIs should have a short description of what would happen in them.
Some other things I think should change:

- Add a walletId column on the TxHistory table (database), so we can get the history by walletId and not by address;
- In the /txhistory GET endpoint we should have a pagination parameter.

I followed @pedroferreira1's idea and added a description of what each API endpoint should do, including details of which tables would be used in each case.
I had to create two tables: AddressTxHistory and WalletTxHistory. Is there a simpler way to do it?
In the /txhistory GET endpoint we should have a pagination parameter.
@pedroferreira1, it does, through the skip and count parameters.
@pedroferreira1 @obiyankenobi @jansegre Can you review it again, please?
Looks good to me
Maintenance Tools: For each table, we should have a verification tool that checks its consistency. This tool should also have a parameter to recalculate the records of a given table. These tools will be integrated in our monitoring and alert system.
I'm not sure I understand what this is.
Address (for all addresses, even if WalletId is null)
- WalletId (indexed, null)
- TokenId (indexed)
- Balance (indexed)
- Number of transactions
- UpdatedAt
Why is Balance indexed?
WalletTxHistory (for all transactions)
- TokenId (indexed)
- WalletId (indexed)
- TxId
- Amount (positive: received / negative: sent)
- Timestamp
- CreatedAt
A tx usually involves more than one wallet (the sender's and recipient's wallets). In this case, would we have one entry for each wallet for each tx?
The API can be consumed both by HTTP and Websocket.
What do you mean by being consumed by Websocket? Something like REST over websocket?
Wallet must receive a balance update message on every new tx, so it can update its balance and tx history.
Is this the only message sent on the websocket? This would be a combination of the balance update + new transaction?
All requests must be signed by m/1/1 of the wallet's xpriv.
What are we signing? A string like the wallet id? Also, where will this authentication be sent on every request? I think the http header is a good place.
GET /wallets: Get status of a wallet
I suggest /wallet.
List of transactions Format: { txId, timestamp, action, amount } Actions: sent, received, move
What's move?
POST /txproposals: Create a new tx proposal
Arguments:
- Input selection algorithm (optional) (e.g. best match, from highest, from lowest)
- Outputs: [[addr1, amount1], [addr2, amount2], ...]
- Comment
What's the comment argument?
How would the wallet send a transaction? Also through the wallet service?
I missed this:
All requests must be signed by m/1/1 of the wallet's xpriv.
I haven't thought much about it, but this feels unusual. Does the BWS do this? I have two concerns:
Yeah, what is being signed should be discussed. I had mentioned signing a string like the wallet id, but I guess it doesn't make sense to have fixed data being signed. Besides, GET requests don't have bodies, so there's no data to sign there.
As for the second point, @jansegre, I'm not sure it's a problem. I don't think the hardware wallet needs to display every piece of data that requires signing. For instance, in our Ledger implementation, we don't display the change output. The user just needs to trust that the wallet is not stealing their money, so the user has to trust the code running on the hardware wallet.
Events from Full Node:
- New tx or block.
- Update tx or block.
One other thing I was thinking: we'll have to update the full node too, right? It'll have to send these events to AWS? It needs the correct AWS credentials, I suppose.
I had to create two tables: AddressTxHistory and WalletTxHistory. Is there a simpler way to do it?
I don't see where AddressTxHistory is used. I think only the WalletTxHistory solves our problem, doesn't it? Or will we use AddressTxHistory to get the address history (for searching an address on the explorer) and WalletTxHistory to get the wallet history?
One other thing I was thinking: we'll have to update the full node too, right? It'll have to send these events to AWS? It needs the correct AWS credentials, I suppose.
I think you are right, if we use SNS for this. We could also send this message over the ws from the full node.
What's move?
I believe it's when we transfer inside the same wallet, just moving from one address to another.
A tx usually involves more than one wallet (the sender's and recipient's wallets). In this case, would we have one entry for each wallet for each tx?
Yes, I think we will have one for each. We need this to get the history for each wallet.
Maintenance Tools: For each table, we should have a verification tool that checks its consistency. This tool should also have a parameter to recalculate the records of a given table. These tools will be integrated in our monitoring and alert system.
I'm not sure I understand what this is.
We have cached pieces of information in multiple tables. For instance, the Balance.balance is the sum of the Address.balance, which is the sum of the UTXOs. Imagine that we find a bug and some of them have not been correctly updated. So, we need some tools to recalculate everything for a given walletId, for instance.
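The cached-balance verification described here can be a cross-check query between the cache table and the source of truth. A minimal sketch, assuming hypothetical table names (utxo, address_balance) consistent with this thread; the real tool would also cover WalletBalance:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE utxo (tx_id TEXT, idx INTEGER, address TEXT, token_id TEXT, value INTEGER);
CREATE TABLE address_balance (address TEXT, token_id TEXT, balance INTEGER);
INSERT INTO utxo VALUES ('t1', 0, 'addr1', '00', 22), ('t2', 1, 'addr1', '00', 5);
INSERT INTO address_balance VALUES ('addr1', '00', 20);  -- stale cached value
""")

def check_address_balances(conn, fix=False):
    """Compare cached address balances against the sum of their UTXOs.
    Returns the list of inconsistent (address, token_id, cached, actual)."""
    bad = conn.execute("""
        SELECT ab.address, ab.token_id, ab.balance, COALESCE(SUM(u.value), 0)
        FROM address_balance ab
        LEFT JOIN utxo u ON u.address = ab.address AND u.token_id = ab.token_id
        GROUP BY ab.address, ab.token_id
        HAVING ab.balance != COALESCE(SUM(u.value), 0)
    """).fetchall()
    if fix:  # the "recalculate" parameter mentioned in the design
        for address, token_id, _, actual in bad:
            conn.execute(
                "UPDATE address_balance SET balance = ? WHERE address = ? AND token_id = ?",
                (actual, address, token_id))
    return bad

print(check_address_balances(conn, fix=True))  # [('addr1', '00', 20, 27)]
print(check_address_balances(conn))            # [] after recalculation
```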
Address (for all addresses, even if WalletId is null)
- WalletId (indexed, null)
- TokenId (indexed)
- Balance (indexed)
- Number of transactions
- UpdatedAt

Why is Balance indexed?
We can sort by Balance to know the richest addresses list.
WalletTxHistory (for all transactions)
- TokenId (indexed)
- WalletId (indexed)
- TxId
- Amount (positive: received / negative: sent)
- Timestamp
- CreatedAt
A tx usually involves more than one wallet (the sender's and recipient's wallets). In this case, would we have one entry for each wallet for each tx?
Exactly, because the amount would be different. I'll clarify it on the design.
The API can be consumed both by HTTP and Websocket.
What do you mean by being consumed by Websocket? Something like REST over websocket?
Exactly. There must be a way to call these API endpoints from websocket. I can look for a protocol to do that over websocket.
Wallet must receive a balance update message on every new tx, so it can update its balance and tx history.
Is this the only message sent on the websocket? This would be a combination of the balance update + new transaction?
It is not the only message. I'll improve the text.
All requests must be signed by m/1/1 of the wallet's xpriv.
What are we signing? A string like the wallet id? Also, where will this authentication be sent on every request? I think the http header is a good place.
I'll improve the text.
GET /wallets: Get status of a wallet
I suggest /wallet.
Ok! Thanks!
List of transactions Format: { txId, timestamp, action, amount } Actions: sent, received, move
What's move?
It is the name used by BWS when you send a tx to yourself.
POST /txproposals: Create a new tx proposal
Arguments:
- Input selection algorithm (optional) (e.g. best match, from highest, from lowest)
- Outputs: [[addr1, amount1], [addr2, amount2], ...]
- Comment
What's the comment argument?
A free text field. It can be used to identify the txproposal.
How would the wallet send a transaction? Also through the wallet service?
Yes, there's a propagate choice in the update txproposal API.
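The input selection strategies named in the proposal (best match, from highest, from lowest) can be sketched as a greedy picker. This is purely illustrative; the function name, the interpretation of "best match" (single smallest utxo that covers the amount), and the greedy logic are my assumptions, not the service's actual algorithm:

```python
def select_inputs(utxos, amount, strategy="from_highest"):
    """Pick utxo values until `amount` is covered. Strategy names mirror the
    proposal; the logic is an illustrative greedy sketch (assumption)."""
    if strategy == "from_highest":
        ordered = sorted(utxos, reverse=True)
    elif strategy == "from_lowest":
        ordered = sorted(utxos)
    else:  # "best_match": smallest single utxo that covers the amount, if any
        covering = [u for u in utxos if u >= amount]
        if covering:
            return [min(covering)]
        ordered = sorted(utxos, reverse=True)
    picked, total = [], 0
    for value in ordered:
        picked.append(value)
        total += value
        if total >= amount:
            return picked
    raise ValueError("insufficient funds")

print(select_inputs([5, 30, 8, 2], 10, "from_lowest"))  # [2, 5, 8]
print(select_inputs([5, 30, 8, 2], 10, "best_match"))   # [30]
```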
Events from Full Node:
- New tx or block.
- Update tx or block.
One other thing I was thinking: we'll have to update the full node too, right? It'll have to send these events to AWS? It needs the correct AWS credentials, I suppose.
Yes, we will have to update the full node. There are two ways to go here: (i) the full node sends the notifications to SNS, or (ii) the full node has an external notification system (e.g. zeromq), and we develop a piece to replicate the notifications into SNS.
I missed this:
All requests must be signed by m/1/1 of the wallet's xpriv.
I haven't thought much about it, but this feels unusual. Does the BWS do this? I have two concerns:
Yes, it does.
- The serialization specification we will have to define for the signature to be consistent (signing a JSON can become a headache if we have protocol proxies (AWS API Gateway?) that may parse the JSON and end up reordering/reformatting stuff);
They use an RFC that describes how to sign any http request.
- How this will play along with hardware wallets, which I'd guess wouldn't be happy signing stuff away without making the user very aware of what is being signed. That would make this impractical to use if the user has to approve every single request (besides having to understand them too). The alternative, having the wallet sign any such message for a period of time, could be a security concern, besides the wallet having to understand the messages to make sure it isn't signing stuff that must require user consent.
I think you're right. I'll give more thought to it.
Exactly. There must be a way to call these API endpoints from websocket. I can look for a protocol to do that over websocket.
What's the goal of having the endpoints also available through websocket?
I had to create two tables: AddressTxHistory and WalletTxHistory. Is there a simpler way to do it?
I don't see where AddressTxHistory is used. I think only the WalletTxHistory solves our problem, doesn't it? Or will we use AddressTxHistory to get the address history (for searching an address on the explorer) and WalletTxHistory to get the wallet history?
It is there just for auditing. If we find a bug we can (easily) recalculate the values of other tables.
Exactly. There must be a way to call these API endpoints from websocket. I can look for a protocol to do that over websocket.
What's the goal of having the endpoints also available through websocket?
I was replying when I noticed that it may not be necessary. The idea is that we will use a websocket to get notifications, so we can update the wallet when something changes. In this case, we would receive a notification and send an HTTP request to get the new piece of information, right? My original idea is that we could authenticate after the websocket connection is open and then use the APIs without any authentication. I'll put this flow in the text, since I need to have a clearer understanding of it.
The idea is that we will use a websocket to get notifications, so we can update the wallet when something changes. In this case, we would receive a notification and send an HTTP Request to get the new piece of information, right?
I'm not sure about this. The way I see it, it's like our wallet nowadays. We do the initial http requests and then we basically receive the updates through websocket, so we don't need to do other http requests.
About using a NoSQL db, I'm not sure it's the best approach. Our data seems to be structured in a way that we could use SQL dbs. We know all fields already.
Also, there seem to be some limitations in DynamoDB for primary keys. It also needs a unique primary key, which can be of 2 formats: a simple primary key (partition key only) or a composite primary key (partition key + sort key).
(Obs: attributes are the equivalent of SQL columns in DynamoDB)
More info here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey
Consider table WalletTxHistory. It has the following attributes/columns:
The primary key needs to be unique. We can't use just (WalletId, TokenId), as several transactions in the same wallet can have the same TokenId. We have to use (WalletId, TokenId, TxId).
This is not supported by DynamoDB, as we can only use 2. There's a way around it, though. We can concatenate TokenId + WalletId, as we'll always query for the history of a wallet's given token. That way, our primary key would be ("WalletId-TokenId", TxId). Doing this concatenation here is not a problem as we'll always query for a WalletId AND TokenId, so we can easily construct it during query time.
Even though it's a hack, we can make it work. It's not so easy to do the same for table AddressTxHistory. I think the TokenId attribute/column is missing in the original design, so I added it here.
Primary key has to be composed by (Address, TokenId, TxId). We cannot use just (Address, TxId), as one tx can have multiple tokens to/from the same address. We need to be able to query this table by Address, so we could have our key as (Address, "TokenId-TxId"). It works, but when fetching the results we'd need to separate TokenId and TxId again from "TokenId-TxId". Or we duplicate the attributes and have everything in the table ("TokenId-TxId", TokenId, TxId).
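The two key layouts described above can be sketched as small helper functions. These are illustrative only: the pk/sk names, the "-" separator (which assumes ids never contain "-"; a real implementation should pick an unambiguous separator), and the function names are mine:

```python
def wallet_history_key(wallet_id, token_id, tx_id):
    """Composite key sketch for WalletTxHistory: the partition key is the
    "WalletId-TokenId" concatenation (both are always known at query time),
    and the sort key is TxId."""
    return {"pk": f"{wallet_id}-{token_id}", "sk": tx_id}

def address_history_key(address, token_id, tx_id):
    """For AddressTxHistory we query by address alone, so the sort key
    carries both TokenId and TxId, which must be split back when reading."""
    return {"pk": address, "sk": f"{token_id}-{tx_id}"}

def split_sort_key(sk):
    """Recover (token_id, tx_id) from the concatenated sort key."""
    token_id, tx_id = sk.split("-", 1)
    return token_id, tx_id

key = address_history_key("addr1", "00", "abc123")
print(key)                      # {'pk': 'addr1', 'sk': '00-abc123'}
print(split_sort_key(key["sk"]))  # ('00', 'abc123')
```

The duplicated-attribute option mentioned above (storing TokenId and TxId alongside the concatenated sort key) avoids the split at read time, at the cost of extra storage.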
EDIT: Now that I've written it and have come up with ways to go around the "limitations", it doesn't seem so bad. Even Amazon recommends concatenating information when creating keys ("country-state-city" in this example). Maybe it just feels like a hack because I'm used to the SQL world.
I guess the main discussion we should have is whether there's an actual advantage in using NoSQL. Our data seems to be structured, so SQL could handle it. The advantage of DynamoDB seems to be related to the pricing model, as we're only charged for reads/writes.
There are other good reasons to choose SQL:
IOW, I'm very pro using SQL (on whatever works best with RDS, maybe Aurora)
I was reading a bit to try to understand why BWS uses NoSQL, but I couldn't find the reason behind it. I agree that we currently have no clear reason to use NoSQL and should go with SQL.
I just would like to know if @msbrogli has read anything that could explain their choice.
I'm not sure about this. The way I see it, it's like our wallet nowadays. We do the initial http requests and then we basically receive the updates through websocket, so we don't need to do other http requests.
I like this approach, but we just need to make it clear what to do when we lose the websocket connection. The wallet reloads the full data; should we do the same here? Process all the information again?
What's the safest approach here?
Regarding the events that the full node will have to emit, we might want a different approach than what we currently have.
Right now we send an event (on the websocket) with the new tx and other events for the txs on the inputs, which are spent. We could send just one event with the tx and all input txs, so it could all be processed at once.
This could prevent that small "bug" with the balance, in which it usually increases and then decreases when sending txs to the same wallet. Also, it could be better to invoke just one lambda function per new tx than one for the new tx plus one for each of its inputs.
As I'm implementing this, that last comment of mine doesn't seem relevant. At least for new transactions, we have all the information we need, as the inputs and outputs have a decoded field with all the necessary information.
There's one other thing I'm in doubt about during the implementation. When a new transaction arrives, how do we know if it belongs to an already-started wallet?
Imagine I have a started wallet (walletId = myWalletId) with addr1 and addr2. So far, I have only received 1 transaction on addr1, with 22 HTR. So the Addresses table will have:

| Address | WalletId | TokenId | Balance | # of txs |
|---------|------------|---------|---------|----------|
| addr1 | myWalletId | 00 | 22 | 1 |

A new transaction arrives that sends some tokens to addr2. How can I associate this address with myWalletId?
At first, I thought that, during wallet creation, we'd also add the unused addresses to this Address table. However, as it also includes the token id, that seems strange to me. We'd add the token id as null:
| Address | WalletId | TokenId | Balance | # of txs |
|---------|------------|---------|---------|----------|
| addr2 | myWalletId | NULL | 0 | 0 |
Maybe there's something I'm missing.
@obiyankenobi We must add the unused addresses to the Address table as well. In this case, it would be a placeholder that must be replaced when a real transaction arrives. As for the token id, I'd use the HTR token id instead of null.
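The placeholder idea above can be sketched as a small helper that produces the rows to pre-insert at wallet creation. The column names and the "00" HTR token uid follow the tables shown in this thread; the function name and dict shape are my assumptions:

```python
HTR_TOKEN_ID = "00"  # HTR token uid, used instead of NULL as suggested above

def placeholder_rows(wallet_id, unused_addresses):
    """Rows to pre-insert into the Address table when a wallet starts, so an
    incoming tx to any derived address can be matched to its wallet with a
    single lookup. Balance and tx count start at zero."""
    return [
        {"address": addr, "wallet_id": wallet_id,
         "token_id": HTR_TOKEN_ID, "balance": 0, "n_txs": 0}
        for addr in unused_addresses
    ]

rows = placeholder_rows("myWalletId", ["addr2", "addr3"])
# A new tx to addr2 can now find wallet_id = 'myWalletId' in one query;
# the placeholder row is then replaced/updated with the real balance.
print(rows[0])
```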
@obiyankenobi We must add the unused addresses to the Address table as well. In this case, it would be a placeholder that must be replaced when a real transaction arrives. As for the token id, I'd use the HTR token id instead of null.
What's the reason we need the token id and balance on the Address table? Get the richest addresses list?
@obiyankenobi We must add the unused addresses to the Address table as well. In this case, it would be a placeholder that must be replaced when a real transaction arrives. As for the token id, I'd use the HTR token id instead of null.
What's the reason we need the token id and balance on the Address table? Get the richest addresses list?
I think there are several uses:
1. Richest list, as you said.
2. Debug the wallet's balance if it is incorrect for any reason. After all, the wallet's balance is equal to the sum of its addresses.
3. Show the list of addresses to the user (this is currently available only in Hathor Desktop Wallet).
One thing about the Address table and the GET /addresses API.
On the table, the addresses are separated by tokenId. Imagine we had 2 transactions in total. The first just had (among the outputs) addr1 with token1. The second had addr1 with token1 and addr1 with token2. The table will look like this (I omitted some columns):
| Address | TokenId | # of txs | WalletId |
|---------|---------|----------|----------|
| addr1 | token1 | 2 | id1 |
| addr1 | token2 | 1 | id1 |
When we fetch the addresses for the wallet with id1, how would we calculate the # of transactions for addr1? It's definitely not the sum. We could get this information from the AddressTxHistory table, maybe during the API call itself (not have it pre-computed). But # of transactions on the Address table seems useless.
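The "not the sum" point can be made concrete: counting distinct tx ids in AddressTxHistory gives the right number at query time. A sketch with sqlite3, using hypothetical snake_case table/column names modeled on the tables in this thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address_tx_history (address TEXT, token_id TEXT, tx_id TEXT);
-- tx 'a' touches addr1 with two tokens; tx 'b' touches addr1 with token1
INSERT INTO address_tx_history VALUES
  ('addr1', 'token1', 'a'), ('addr1', 'token2', 'a'), ('addr1', 'token1', 'b');
""")

def tx_count(conn, address):
    # COUNT(DISTINCT tx_id): a tx involving several tokens on the same
    # address counts once, which the per-token sum would get wrong (2 + 1 = 3).
    (n,) = conn.execute(
        "SELECT COUNT(DISTINCT tx_id) FROM address_tx_history WHERE address = ?",
        (address,),
    ).fetchone()
    return n

print(tx_count(conn, "addr1"))  # 2, not the per-token sum 3
```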
One other thing we might have is both an Address and an AddressBalance table.
I'll go with the suggestion I gave on the last comment for addresses.
We need to store # of txs for each address in order to know if it's been used or not. We cannot use just the balance, as the address might've been used but happens to have balance 0.

# of txs is also used on the explorer, when we show the # of txs for the address for each token.
One thing I noticed from the design is that it's missing an address column on the Address table. Probably you've already added it.

And I agree the best approach would be to get the tx count from the AddressTxHistory table during the API call.
And I agree the best approach would be to get the tx count from the AddressTxHistory table during the API call.
I'll actually do as I suggested, so there should be a field on the database for the # of txs and we won't need to calculate it.
The reason for this is that we need a way to know if an address has been used or not. Just having the balance doesn't help, as the address might have been used but happens to have balance 0.
The reason for this is that we need a way to know if an address has been used or not. Just having the balance doesn't help, as the address might have been used but happens to have balance 0.
This could be easily done by checking if there is any data on AddressTxHistory for this address.

But it's okay, I am good to go with your solution. You just need to also add the # of txs field on the AddressBalance table, for the reason I mentioned, if you go with it.
This could be easily done by checking if there is any data on AddressTxHistory for this address.
That's true. But I think I'll go with 2 separate tables to have it pre-computed (and we can also store the address derivation path on the Address table).
You just need to add # of txs field also on the AddressBalance table for the reason I mentioned if you go with it.
Makes sense, I hadn't thought of that. Thanks.
Ok, so for the locked/unlocked balances, this is what I thought.
The AddressBalance table would require locked/unlocked balances:
| address_balance |
|-------------------|
| address |
| token_id |
| locked_balance |
| unlocked_balance |
| n_of_transactions |
The same for WalletBalance:
| wallet_balance |
|-------------------|
| wallet_id |
| token_id |
| locked_balance |
| unlocked_balance |
| n_of_transactions |
The Utxo table already has a timelock column:
| utxo |
|----------|
| tx_id |
| index |
| token_id |
| value |
| timelock |
I see 2 possibilities here. We can:

1. re-use the timelock column, setting it to null once the utxo becomes unlocked; or
2. add a locked column that will indicate whether the utxo is locked or not. So if an utxo has a timelock in the past, the timelock column will still be set but the locked column will be null (or false).

I believe there are 2 "external" factors we should also consider: block height lock and tx-proposals. Both lock a tx, but not with a specific timestamp. Using idea 1, we could set the timelock column to MAX_INT when a block is not yet unlocked or an utxo is part of a tx-proposal. This way they'll be considered locked. Later, when they are unlocked, it should be set to null.
To be honest, I haven't seen any advantages or disadvantages in either. Re-using the timelock column should do the trick, but adding an extra column preserves the utxo timelock information (not useful now, but might be useful in the future, even though it's easy to get it directly from the tx as we have the tx_id and index).
What do you think?
One other possibility is doing like proposal 1 above, but never deleting the timelock info from the table. When we want to create a tx proposal, we query the table and filter by timelock <= some_date. This way we get only those that are unlocked.
For blocks, since they are unlocked by height, we could set the timelock to MAX_INT and later remove it.
I wonder how useful the timelock information is; I'd guess it is somewhat useful, so I'd say proposal 1 with timelock <= some_date on queries seems the best of both worlds (although performance-wise, it really isn't obvious which is best, and I think this is fine). As for blocks, couldn't we just add a heightlock and use heightlock <= some_height on queries? These even work well with NOT NULL because a value of 0 can be used in both cases to indicate no lock. I don't really see the problem of adding a column.
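The heightlock/timelock proposal above can be sketched as a single query where 0 means "no lock" in both columns and the lock info is never deleted. A sqlite3 sketch with hypothetical column names following the utxo table in this thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE utxo (tx_id TEXT, idx INTEGER, token_id TEXT, value INTEGER,
                   timelock INTEGER NOT NULL DEFAULT 0,
                   heightlock INTEGER NOT NULL DEFAULT 0);
INSERT INTO utxo VALUES
  ('t1', 0, '00', 10, 0,    0),    -- no locks
  ('t2', 0, '00', 20, 5000, 0),    -- timelocked until timestamp 5000
  ('b1', 0, '00', 64, 0,    120);  -- block reward, unlocked at height 120
""")

def unlocked_utxos(conn, now_ts, best_height):
    # 0 means "no lock" for both columns, so NOT NULL works and the lock
    # info is never deleted; filtering happens purely at query time.
    return conn.execute(
        "SELECT tx_id, value FROM utxo WHERE timelock <= ? AND heightlock <= ?",
        (now_ts, best_height),
    ).fetchall()

print(unlocked_utxos(conn, now_ts=4000, best_height=100))  # [('t1', 10)]
print(unlocked_utxos(conn, now_ts=6000, best_height=130))  # all three rows
```

With indexes on timelock and heightlock, this keeps the utxo table append/delete-only, as suggested.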
As for blocks, couldn't we just add a heightlock and use heightlock <= some_height on queries? These even work well with NOT_NULL because a value of 0 can be used on both cases to indicate no lock. I don't really see the problem of adding a column.
I like this idea. We'd also always keep the information in the tables (no need to set to null), which may be even better when dealing with conflicts, as the best block height might change. I'll follow it.
I did some research on how we could deal with timelocks. It's a bit more complex than the heightlock. The latter is easy to handle because an old block becomes unlocked every time a new one arrives. We don't need to schedule anything.
For timelocks, these are some suggestions:
What do you think? I'm inclined to go with 1. Seems like a reasonable precision.
The problem is not entirely clear to me yet. Let me see if I understand.
These affect at most 3 tables, address_balance, wallet_balance and utxo, so I'll only focus on them:

- utxo can be simplified a lot by using timelock and heightlock; that way row operations should basically be insert and delete, and those are only made on every block update, which seems great.
- address_balance and wallet_balance have the columns locked_balance and unlocked_balance, which potentially depend on both the current time and the current block height. Which is what we're trying to deal with.

Ideally locked_balance and unlocked_balance would behave like "calculated fields" (like properties in python), but that is inefficient, which is probably the reason these fields exist in the first place. So they are basically a cache.

All the proposals 1 to 4 focus on how to update every one of these cached values as soon as reasonably possible. But I don't know if this is necessarily great. I'll add a proposal 5: a "lazy evaluation". Add locks_expire_height and locks_expire_timestamp columns on both address_balance and wallet_balance, and only when querying wallet_balance, if the balance is invalid (which we can tell, because when querying we have the latest timestamp and height), we update the locked/unlocked balances of the wallet and its addresses. I don't imagine this being more complex than configuring a trigger system to update all balances, but please do correct me on this one.
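The lazy-evaluation proposal can be sketched as a check-then-refresh on read. This is only an illustration of the control flow: the row shape and the `recompute` callback (standing in for the real query over the wallet's utxos) are hypothetical:

```python
import time

def get_wallet_balance(row, now=None, recompute=None):
    """Lazy-evaluation sketch (proposal 5): `row` caches locked/unlocked
    balances plus locks_expire_timestamp; if a lock has expired by query
    time, refresh the cache before answering. `recompute` stands in for
    the real query over the wallet's utxos (hypothetical helper)."""
    now = time.time() if now is None else now
    expires = row.get("locks_expire_timestamp")
    if expires is not None and expires <= now:
        row.update(recompute(row))  # refresh cached fields in place
    return row["unlocked_balance"], row["locked_balance"]

row = {"unlocked_balance": 10, "locked_balance": 20, "locks_expire_timestamp": 5000}
fresh = lambda _: {"unlocked_balance": 30, "locked_balance": 0,
                   "locks_expire_timestamp": None}
print(get_wallet_balance(row, now=4000, recompute=fresh))  # (10, 20): still locked
print(get_wallet_balance(row, now=6000, recompute=fresh))  # (30, 0): refreshed
```

The same check would apply to locks_expire_height against the best block height.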
If we have to go with one of these proposals, I'd go with 1, because it doesn't seem to need any extra configuration; we already have a stream of block events. But the trade-offs I see are: a) having a slightly outdated balance lock (~30s) vs having to wait for a balance update query (probably a fast query); b) one query to update all balances, even when they are not being used, vs multiple queries that only update the balances that are being requested.
It all depends on how much processing we want to do when there's an API call. Let's use your proposal and imagine someone queries the wallet balance, which has a locks_expire_timestamp set. From just that information, we don't know which utxo is now unlocked. We'd have to find out which addresses belong to this wallet and then query the utxo table to find out the unlocked utxo(s) (or just a sql join). Anyway, it seems like the kind of thing we were trying to avoid by calculating everything beforehand.
@jansegre not sure I got your idea right or I'm not seeing an easier way to implement it.
Wait, shouldn't UTXOs have an address field? It would be possible if tx_id and index were enough to find the address, but I think it would be useful to have the address on the utxo table. I know it's one more field, but this seems like a useful one, no? The rest I assume is there already (there should be a way to find all UTXOs that make up a wallet through tx_id and a transaction-wallet table, which I imagine we have, because we have to list all transactions of a wallet). Or we could add a utxo_wallet or address_wallet index pair table.
We do have the address on the utxo table. As I said, we'd have to query:
We'd have to find out which addresses belong to this wallet and then query the utxo table to find out the unlocked utxo(s) (or just a sql join).
But I think this design tries to prevent as much as possible these on-demand calculations and prefers to have pre-computed values.
Also, we might have more than one utxo with timelock. In that case maybe we'd need a list of locks_expire_timestamp. To be honest, I'm still leaning a bit more towards the other solutions.
I think @jansegre's idea seems better; it doesn't add much more processing (even though it adds more complexity to the code). Updating the address_balance is easy, and if we add a wallet_id column to the utxo table it's also easy to update the wallet_balance.
The good thing is that we will always have the correct balance, even if there is a bug in our mining and we end up having some hours without blocks.
The bad thing is that we need to do this check (see if there is a balance to be updated) every time someone requests the balance (on the wallet, or on the explorer to get an address balance, or the richest addresses). However, this check would be needed every time a block arrives in the first option, so I don't see that as something really bad.
We do have the address on the utxo table. As I said, we'd have to query:
We'd have to find out which addresses belong to this wallet and then query the utxo table to find out the unlocked utxo(s) (or just a sql join).
I just read DATABASE.md and it looks like we do have everything we need. I misunderstood that we needed more info to make this query possible. And it seems to me that the whole update could be made with one or two queries, which admittedly aren't trivial, but probably won't have more than 2 joins each, and would have to visit all utxos of the addresses needing an update. Even in the range of 10k utxos, that seems to me like it should be a nearly instant query (at least with the right indexes) for MySQL.
Then again, the same could be said for a query that updates all expired locks globally, but that would scale much worse, for the benefit of always having a balance at hand that will sometimes be outdated. If we accept that the balance can be outdated sometimes, we could return the current (expired) balance before waiting for the update query to end.
But I think this design tries to prevent as much as possible these on-demand calculations and prefers to have pre-computed values.
I agree with the sentiment, but I don't fully agree with "as much as possible"; there are other trade-offs to consider. This query would most likely have to be run the first time a miner (assuming a miner because it's the most common wallet user with locks) opens their wallet, and then every N seconds (on average), N being how long it takes for them to find a block (because that's how far apart heightlocks will usually be). And since I'm assuming this query won't take more than a few hundred milliseconds (if even that), it's a very fair price to pay for scaling better, always returning the correct up-to-date balance, and having a slightly less complicated deploy.
An orthogonal idea (that can probably be used along with whatever we decide to do with locks, and leans more towards pre-computed values) would be pre-computing a list of future "(unlocked, locked)" balances that approximates when the next unlocks will be (it would have to somewhat synchronize heightlock+timelock) and pre-computes the next N balances (for some N, or even until there aren't any locks, which shouldn't be too high a count), updating the forecasts and deleting the past balances in the background. When we receive a query, we just use the balance that is valid at the timestamp the query was made. This could be relevant if whatever solution we go for ends up having bad performance.
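The forecast idea can be sketched as building, from (value, unlock_timestamp) pairs, the step list of future unlocked balances so a query just picks the entry valid at its timestamp. The function name and the pair representation are illustrative assumptions (heightlocks would need to be mapped to approximate timestamps first, as noted above):

```python
def balance_schedule(utxos, now):
    """From (value, unlock_ts) utxos, build the list of future
    (timestamp, unlocked_balance) steps; unlock_ts <= now means unlocked."""
    unlocked = sum(v for v, ts in utxos if ts <= now)
    schedule = [(now, unlocked)]
    # Each distinct future unlock timestamp adds its values to the total.
    for ts in sorted({ts for _, ts in utxos if ts > now}):
        unlocked += sum(v for v, t in utxos if t == ts)
        schedule.append((ts, unlocked))
    return schedule

utxos = [(10, 0), (20, 5000), (5, 5000), (7, 9000)]
print(balance_schedule(utxos, now=1000))
# [(1000, 10), (5000, 35), (9000, 42)]
```

Answering a query is then a lookup of the last entry whose timestamp is <= the query time.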
Also, we might have more than one utxo with timelock. In that case maybe we'd need a list of locks_expire_timestamp.
That's what I imagined, and I think the same query would update all the balances needed. What I keep thinking is "how hard can it be to make this query?". I may very well be underestimating this, but if it's mostly a matter of "this will make the wallet slow", I think we could make this query and benchmark it a little (or against updating all addresses, considering the UTXOs already on the mainnet) to have an idea of the performance impacts.
Just 2 things about this idea:

1. I'm not sure it's worth doing this for heightlock. We can already update it when a block arrives, because that's when the old block becomes unlocked. It's a pretty simple operation, so I don't feel there's any advantage in doing it differently. For timelock it's a different thing;
2. if we decide to go this route, it's important to also discuss the sync strategy between wallets and this service. For instance, the current wallet only queries information from the full node when it starts and then listens for the transactions on websocket. In this case, there would be no new queries to trigger this balance update. So if I open a wallet one minute before a timelock expires, I wouldn't make another query later and get the new unlocked balance;
2. if we decide to go this route, it's important to also discuss the sync strategy between wallets and this service. For instance, the current wallet only queries information from the full node when it starts and then listen for the transactions on websocket. In this case, there would be no new queries to trigger this balance update. So if I open a wallet one minute before a timelock expires, I wouldn't make another query later and get the new unlocked balance;
To me this is actually the best argument against it so far, and probably tips the balance in favor of your initial proposals (of some form of event that would trigger the update).
The only alternative I see that could possibly solve this without relying on time-based events is to move away from a simple (locked, unlocked) balance structure to something like `[(0, <full balance>), (-1, <balance locked by height or "permanently locked">), (<ts1>, <balance unlocked after ts1>), (<ts2>, <balance unlocked after ts2>), ...]`, which the UI can use to expose a (unlocked, locked) balance or even display when the next amount will be unlocked. But this is obviously more complicated and I'm not sure I like it that much.
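One possible reading of that structure (the sentinel semantics are my interpretation of the comment above, not a spec): `0` marks the already-unlocked amount, `-1` marks the height-locked / permanently locked amount, and positive values are unlock timestamps. Reducing it back to a plain (unlocked, locked) pair could look like:

```typescript
// [timestamp, amount]: 0 = unlocked now, -1 = locked by height or
// permanently, positive = unlocked after that unix timestamp.
type BalanceEntry = [number, number];

function toSimpleBalance(entries: BalanceEntry[], now: number) {
  let unlocked = 0;
  let locked = 0;
  for (const [ts, amount] of entries) {
    if (ts === 0 || (ts > 0 && ts <= now)) unlocked += amount;
    else locked += amount; // ts === -1 or a future timestamp
  }
  return { unlocked, locked };
}
```

The same list also gives the UI the "next unlock" for free: the smallest positive timestamp greater than `now`.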
I think we could review the balance structure when we include support for multisig. I suspect some sort of improvement of this structure might be inevitable. Also we should consider researching how BWS does this (and possibly other wallets too), it's probably a problem that has been solved many times in many different ways.
For now, I'm leaning towards proposal 1, piggy-backing on block events to get the "time events" that would signal a possible unlock.
I think that @jansegre's idea is better and simpler. To reduce some of the downsides pointed out by @obiyankenobi, we can run a "general utxo update" task once a day (twice a day?) that will go through all utxos with `timelock <= now` (we need to properly index the db) and update their wallets.
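A sketch of that scheduled task's core step, over an in-memory stand-in for the UTXO table (in the real service this would be an indexed `timelock <= now` query; field names are assumptions):

```typescript
interface Utxo {
  txId: string;
  walletId: string;
  timelock: number | null; // unix timestamp, or null if not time-locked
  unlocked: boolean;
}

// Mark every utxo whose timelock has expired as unlocked, and collect
// the wallets whose pre-computed balances must then be recalculated.
function unlockExpiredUtxos(utxos: Utxo[], now: number): Set<string> {
  const affectedWallets = new Set<string>();
  for (const u of utxos) {
    if (!u.unlocked && u.timelock !== null && u.timelock <= now) {
      u.unlocked = true;
      affectedWallets.add(u.walletId);
    }
  }
  return affectedWallets;
}
```

Running it once or twice a day bounds how stale a time-locked balance can get between a timelock expiring and the next run.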
Regarding the sync between the wallet and the service, I don't like @jansegre's idea of storing multiple balances according to the timestamp. We can simply include a `next_refresh` field equal to the minimum timelock among the utxos of that wallet (or null). So, even if no transaction has been received for a while, the wallet will trigger the update. What do you think?
We'd have one `next_refresh` value per wallet? How does the wallet get this value in case it's been changed? We again fall into the case of having to debate the sync between the wallet and this service, as this value needs to be updated.
I think an easier way to see this is looking at the balance returned by the API as the following structure (which doesn't have to be the structure we use, but it would be functionally identical apart from a refactor):
balance: {
  available: X,
  locked: Y,
  expires: T,
}
The wallet is then able to display the balance until the time reaches `expires`, when it will have to make a "refresh" query to get the updated balance (including the new `expires`).
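On the wallet side, using `expires` to schedule the refresh could be as simple as the sketch below (the clamp to zero handles an `expires` that is already in the past; the helper name is hypothetical):

```typescript
// Milliseconds until the wallet should refresh its balance, given the
// `expires` unix timestamp (seconds) returned by the API. Never negative,
// so a stale `expires` triggers an immediate refresh instead of a bad delay.
function msUntilRefresh(expires: number, nowMs: number = Date.now()): number {
  return Math.max(0, expires * 1000 - nowMs);
}

// Usage sketch:
//   setTimeout(() => refreshBalance(), msUntilRefresh(balance.expires));
```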
This value represents the earliest timelock to expire, and it can't change unless a new tx/block for this wallet arrives, and that is an event that should trigger the same update mechanism a schedule would (which I assume is a solved problem: how to make this update reach the frontend).
The more I think about it, the more I feel this is the functionality relevant to <0.1% of use cases that requires >30% of the design effort. By no means do I think we should ignore it, but we could definitely postpone it and let time-locked balances be outdated sometimes, no?
Introduction
The `hathor-wallet-service` will be used as the backend for all official Hathor wallets. This design is highly inspired by the Bitcore Wallet Service.
Sync between full node and service is being designed here: PR https://github.com/HathorNetwork/hathor-wallet-service/issues/8.
TODO LIST
TEST LIST
Requirements
Architecture
The service will be implemented using AWS services.
Services
Tx Processor
Receives transactions/blocks from the full node and updates the database. It will receive two types of messages: NEW_TX and UPDATE_TX. The latter is used to notify a change in the tx status (from executed to voided or vice versa).
It must keep the balance and transaction history of all addresses, even the ones not designated for a wallet. In the future, we may include an API to query the address and tx history for any address. We would also easily know the richest addresses.
When a new tx arrives or a tx is updated, it must update the indexes of all affected wallets.
When the full node restarts, it should send all notifications again. If the status of a transaction is unchanged, we can safely skip it.
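A sketch of that idempotent event handling (the event shape and status values are assumptions based on the description above; a `Map` stands in for the database):

```typescript
type TxStatus = "executed" | "voided";

interface TxEvent {
  type: "NEW_TX" | "UPDATE_TX";
  txId: string;
  status: TxStatus;
}

// Idempotent handler: when the full node restarts and re-sends events,
// a tx whose stored status matches the incoming one is skipped.
// Returns true only if the event actually changed the store.
function handleTxEvent(store: Map<string, TxStatus>, ev: TxEvent): boolean {
  if (store.get(ev.txId) === ev.status) return false; // safe to skip
  store.set(ev.txId, ev.status);
  // ...here the real Tx Processor would update balances and histories...
  return true;
}
```

The skip check is what makes full-node replays cheap: only the comparison runs, none of the balance updates.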
API Service
Load from the database and return to the wallet. The idea is to do as little computation as possible here; everything should be pre-computed by the Tx Processor.
Maintenance Tools
For each table, we should have a verification tool that checks its consistency. This tool should also have a parameter to recalculate the records of a given table. These tools will be integrated in our monitoring and alert system.
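One way to read this: for each table there is a checker that recomputes its records from the source of truth and reports mismatches, and the "recalculate" mode overwrites the bad records. A sketch for the Balance table, recomputing each wallet's balance from its tx history (table shapes are assumptions):

```typescript
interface HistoryEntry {
  walletId: string;
  amount: number; // positive for received, negative for sent
}

// Recompute balances from history and return the wallets whose stored
// balance disagrees: the candidates for the tool's recalculate mode.
function findInconsistentBalances(
  history: HistoryEntry[],
  stored: Map<string, number>,
): string[] {
  const computed = new Map<string, number>();
  for (const h of history) {
    computed.set(h.walletId, (computed.get(h.walletId) ?? 0) + h.amount);
  }
  const inconsistent: string[] = [];
  for (const [walletId, balance] of stored) {
    if ((computed.get(walletId) ?? 0) !== balance) inconsistent.push(walletId);
  }
  return inconsistent;
}
```

Wiring the non-empty result into the monitoring system gives the alert; running the same computation in write mode gives the repair.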
Database
Flow from wallets
`wallet_id = sha256d(pubkey)`).
Events from Full Node
API for the Wallet
Features:
Authentication
All requests must be signed by `m/1/1` of the wallet's xpriv.
POST /wallet: Create a new wallet
When the wallet is created, the following tasks must be executed:
After requesting the creation of the wallet, the client must poll to verify when it is ready. It may take a few seconds. If an error occurs, the user must have the option to check the status again or reset the wallet.
Arguments:
Returns:
{ walletId, status }
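The `walletId` returned here is derived as `wallet_id = sha256d(pubkey)` per the flow section. A minimal Node sketch of that double hash (the exact pubkey encoding isn't specified here, so the input is just raw bytes):

```typescript
import { createHash } from "crypto";

// sha256d: SHA-256 applied twice, as in the wallet_id derivation.
function sha256d(data: Buffer): string {
  const first = createHash("sha256").update(data).digest();
  return createHash("sha256").update(first).digest("hex");
}
```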
GET /wallets: Get status of a wallet
Arguments:
Returns: All fields of the wallet, including its status.
GET /addresses: Get wallet's addresses
We can run one query in the Address table filtering by a given walletId.
Open question: Should we paginate?
Returns: the wallet's addresses (m/1/1, m/1/2)
GET /balance: Get wallet's balance
Run one query in the Balance table filtering by walletId (and tokenId if the argument is passed).
Arguments:
Returns:
If tokenId is passed, it will return only the requested token.
{ tokenId, tokenName, tokenSymbol }
GET /txhistory: Get wallet's transaction history
Arguments:
Returns:
{ txId, timestamp, action, amount }
GET /tx: Get a transaction's details
Arguments:
Returns:
GET /txproposals: List the wallet's tx proposals
Arguments:
Returns:
POST /txproposals: Create a new tx proposal
When a tx proposal is created, it should mark the used UTXOs as locked by assigning a TxProposalId to them. That way, the next tx proposal will skip them, avoiding double spending. We can optionally create a scheduled job to cancel any tx proposal that has been open for more than X minutes.
Arguments:
[[addr1, amount1], [addr2, amount2], ...]
Returns:
POST /txproposals/update: Update tx proposal
Arguments:
[inputData0, inputData1, ...] (optional)
bool (optional)
Returns:
POST /txproposal/close: Cancel tx proposal
When a tx proposal is closed, it should remove all locks from the UTXOs used by it.
Arguments:
Returns: