Open maltemoeser opened 5 years ago
Alternatively: add a command blocksci_parser reset that empties all directories but the mempool folder, with an optional flag to also remove mempool data.
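A minimal sketch of what such a reset could do, written as standalone Python rather than as part of blocksci_parser. The function name `reset_data_dir`, the `remove_mempool` flag, and the folder name `mempool` are assumptions for illustration, not the parser's actual layout or interface:

```python
import shutil
from pathlib import Path


def reset_data_dir(data_dir, mempool_name="mempool", remove_mempool=False):
    """Empty every subdirectory of data_dir, keeping the mempool folder by default.

    Hypothetical helper: the directory layout and names are assumptions.
    Returns the names of the directories that were emptied.
    """
    root = Path(data_dir)
    emptied = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # leave top-level files (e.g. a config file) untouched
        if entry.name == mempool_name and not remove_mempool:
            continue  # mempool data cannot be re-created, so keep it
        shutil.rmtree(entry)  # delete the directory and everything in it
        entry.mkdir()         # recreate it empty
        emptied.append(entry.name)
    return emptied
```

Making the mempool removal opt-in means that running the reset with default arguments can never destroy data that is impossible to regenerate.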
That's a good idea. I think a separate directory that stores mempool data globally might be the better option. Then multiple parsed versions of a chain (e.g. at different block heights) can use the same mempool directory, without having to keep multiple directories updated or run multiple mempool recorders.
However, both options should be fairly easy to implement.
Related idea: It might be helpful for (new) users to offer mempool data for download, as this is something that can't simply be re-created.
@martinplattnr going a step further, I think the current deep integration with BlockSci is not ideal, as it means that in order to record mempool timestamps you need to run a server that can run BlockSci 24/7, which is quite costly. Furthermore, the data format can't easily be parsed and may not even be reusable across different machines (#2). Ideally, there would be a lightweight client recording transaction timestamps, and then a tool that converts these into the optimized data format for BlockSci.
Yes, you are right, the current implementation is not ideal.
A simple Python script that connects to a Bitcoin node and logs <BlockHash, Timestamp> and <TxHash, Timestamp> pairs, and an importer, may be enough.
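A rough sketch of such a recorder, assuming a Bitcoin Core node with JSON-RPC enabled and polling its `getbestblockhash` and `getrawmempool` calls. The function names, the CSV layout, and the polling approach are assumptions for illustration. Timestamps are only as precise as the polling interval, and a subscription to the node's ZMQ notifications would likely be more accurate:

```python
import base64
import csv
import json
import time
import urllib.request


def make_rpc(url, user, password):
    """Return a callable that sends JSON-RPC requests to a Bitcoin Core node."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def rpc(method, *params):
        payload = json.dumps(
            {"jsonrpc": "1.0", "id": "recorder", "method": method, "params": list(params)}
        ).encode()
        request = urllib.request.Request(
            url,
            data=payload,
            headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)["result"]

    return rpc


def poll_once(rpc, seen_txs, seen_blocks, now=None):
    """Return (kind, hash, timestamp) rows for hashes not seen in earlier polls."""
    now = int(time.time()) if now is None else now
    rows = []
    tip = rpc("getbestblockhash")
    if tip not in seen_blocks:
        seen_blocks.add(tip)
        rows.append(("block", tip, now))
    for txid in rpc("getrawmempool"):
        if txid not in seen_txs:
            seen_txs.add(txid)
            rows.append(("tx", txid, now))
    return rows


def record(rpc, path, interval=1.0):
    """Append first-seen timestamps to a CSV file until interrupted."""
    seen_txs, seen_blocks = set(), set()  # grows without bound in this sketch
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        while True:
            writer.writerows(poll_once(rpc, seen_txs, seen_blocks))
            handle.flush()
            time.sleep(interval)
```

Usage would be along the lines of `record(make_rpc("http://127.0.0.1:8332", "user", "password"), "timestamps.csv")`, with the URL and credentials as placeholders. A plain CSV file can be read on any machine, which leaves the conversion into BlockSci's optimized format to a separate importer.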
I added some thoughts related to this and #2 in #2 itself.