Open jiagengliu opened 5 years ago
It seems that the default choice of disk space is 500 GB. I wonder if that may not be enough for BlockSci?
You are right, 500 GB is probably no longer enough for using BlockSci on Bitcoin. My fully synced Bitcoin node has 284 GB of data, and the parsed BlockSci data adds roughly another 200 GB (I don't have a fully parsed Bitcoin chain available), so with clustering you easily need more than 500 GB, I guess.
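To make the arithmetic explicit, here is a rough sketch that uses only the figures quoted in this thread (284 GB node data, about 200 GB parsed data, and the 500 GB and 700 GB volume sizes). The helper name `headroom_gb` is made up for illustration, and the numbers are snapshots that grow with the chain, so treat the result as a lower bound on what you need rather than a sizing rule.

```python
# Figures quoted in this thread, in GB. Both grow as the chain grows.
NODE_DATA_GB = 284    # fully synced Bitcoin node
PARSED_DATA_GB = 200  # approximate parsed BlockSci data (chain not fully parsed)


def headroom_gb(volume_gb, *used_gb):
    """Return the space left on a volume after subtracting the given usages."""
    return volume_gb - sum(used_gb)


if __name__ == "__main__":
    for volume in (500, 700):
        left = headroom_gb(volume, NODE_DATA_GB, PARSED_DATA_GB)
        # Whatever is left has to cover clustering output, the OS and notebooks.
        print(f"{volume} GB volume: {left} GB left")
```

With a 500 GB volume this leaves 16 GB, which is why clustering pushes it over the limit; a 700 GB volume leaves 216 GB.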
Regarding the AWS instance: I can't find default disk size settings for the r4.2xlarge instance. Isn't it defined by the user when creating the instance and its EBS volume?
Anyway, a note about the disk space requirements should be added to the docs.
I believe the default disk size is based on the disk size at the time the AMI was created; I wasn't able to find a way to increase it for the existing AMI. I'll make sure to add a warning to the docs that a larger disk size should be chosen.
@jiagengliu you can increase the disk size using this guide
Added in 2a597a4fd3cafe080078b6c8ed6bb05d094bccb0
I'll leave this open for a while since it might affect other users too
Thank you @maltemoeser and @martinplattnr. Let me rephrase it: when launching your instance from the AMI in the readme file, do NOT click the blue "Review and Launch" button right away. Instead, proceed with the configuration and change the size of the root volume to something above 700 GB.
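If you launch instances from a script instead of the console, the same change can be expressed as a block device mapping. The sketch below only builds the `BlockDeviceMappings` argument accepted by the EC2 RunInstances API (for example via boto3's `run_instances`). The function name is hypothetical, and the root device name (`/dev/sda1`) and volume type (`gp2`) are assumptions: check the root device name of the AMI you are actually using.

```python
def root_volume_mapping(size_gb, device_name="/dev/sda1", volume_type="gp2"):
    """Build a BlockDeviceMappings list that sets the root volume size.

    device_name is an ASSUMPTION; it must match the AMI's root device name.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return [
        {
            "DeviceName": device_name,
            "Ebs": {"VolumeSize": size_gb, "VolumeType": volume_type},
        }
    ]


if __name__ == "__main__":
    # 750 GB is just an example of "something above 700 GB".
    print(root_volume_mapping(750))
    # With boto3 installed and credentials configured, this could be passed as:
    #   ec2.run_instances(ImageId=<your AMI id>, InstanceType="r4.2xlarge",
    #                     MinCount=1, MaxCount=1,
    #                     BlockDeviceMappings=root_volume_mapping(750))
```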
@maltemoeser I didn't notice the issue until I failed to save my notebook. It may also be a good idea to warn the user when the parser is about to run out of disk space.
@jiagengliu having a warning would be nice indeed. However I doubt that many users will check the parser logs, so putting a warning there will probably largely go unnoticed.
Maybe it's also a good idea to point users to the parser logs in the documentation. I didn't know about the log and only checked the process monitor (top) to get a sense of what's going on.
I've added a warning on v0.6 in the Python interface, that's probably how most users interact with BlockSci anyways. I've also started a list of useful warnings for the parser in #293.
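For anyone who wants a check of their own in a notebook before v0.6, here is a minimal sketch of a low-disk-space warning built on the standard library's `shutil.disk_usage`. This is not BlockSci's implementation; the function names and the 10% threshold are assumptions chosen for illustration.

```python
import shutil
import warnings


def free_fraction(path="/"):
    """Return the fraction of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def warn_if_low(path="/", min_free_fraction=0.10):
    """Emit a RuntimeWarning and return True if free space is below the threshold."""
    frac = free_fraction(path)
    if frac < min_free_fraction:
        warnings.warn(
            f"Only {frac:.1%} of disk space left on {path!r}",
            RuntimeWarning,
        )
        return True
    return False


if __name__ == "__main__":
    # Check the root volume, which is the one that filled up in this issue.
    warn_if_low("/")
```

Calling `warn_if_low("/")` at the top of a notebook, or periodically while the parser runs, would have flagged the problem before saving the notebook failed.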
Could you please update the AMI?
Does anyone have a rough idea of the cost of running the AMI and having it update the Bitcoin data directory? I am struggling with my local machine (it will take about 40 days to update) and I am looking for a plan B.
@trekianov Referring to #2, you could try starting an instance from the AMI and downloading the parsed data to your local machine to start your analysis, instead of maintaining a full node on your machine.
@maltemoeser I have a dumb follow-up question: once we are done with parsing, is it safe to delete the original blockchain database (usually ~/.bitcoin)? Thank you!
Hey, once you have parsed the data into the BlockSci format, it's perfectly safe to delete the original .bitcoin folder. The BlockSci analysis will still work, as it uses its own format.
But once .bitcoin is deleted, you won't be able to parse new blocks to update BlockSci.
Reproduction Steps
Using r4.2xlarge with the AMI supplied in https://citp.github.io/BlockSci/readme.html. After launching, run the following script:
After one day, the root disk will be filled up; df returns:
System Information
Using AMI: Yes
BlockSci version: 0.5
Blockchain: Bitcoin
Parser: Disk/RPC
Total memory: 61 GB