crypto-analysis
A project wherein I aim to analyze cryptocurrencies in search of repeatable patterns, trends, relationships, and interesting features.
Powered by CoinGecko API: https://www.coingecko.com/en/api (using CoinGecko API wrapper pycoingecko https://github.com/man-c/pycoingecko)
Important: This project is not intended in any way to serve as financial advice. It is only a way for me to practice analysis on data that I have collected myself.
This project is a public version of my own private project. As such, this version publishes only a very small sample of my data. Around 40 coins are removed from this version, and it has around 7 months less minutely data. I am willing to elaborate on the private version on request.
Key features of this project:
- Collecting History Data - Running "collectHistoryData.py" collects crypto data from CoinGecko using various parameters. "main.py" builds on this and runs data collection for a list of over 40 coins, saving the results as .csv files. Some examples of these 24-hour CSV files can be found in data/bitcoin. That directory also contains a couple of CSV files with hourly data over 90-day periods.
- Automated Daily Collection - Using cron on my Raspberry Pi, "main.py" is run automatically at 8-hour intervals to collect 24 hours' worth of minutely data. 8 hours may seem an odd interval when each run collects 24 hours of data, but the overlap is a safeguard: every minute is covered by roughly three consecutive runs, so my script almost never misses available data because of a CoinGecko server outage.
- Appending CSV Data - Once a collection of 24-hour CSV files has been acquired, the "append_csv_data.py" script can be run to efficiently append the data from all of these files into one merged CSV file. The script checks for a merge point with each new CSV file, so it does not write the same data more than once when files overlap (which they often do because of my collection interval). A sample merged file can be found in "WIP/append_csv_data/0000_merge_test.csv".
- Trend Histogram - Ironically, this did not end up being a histogram, although it was fully intended to be - I just thought a standard line plot captured the important areas more effectively. Running "WIP/trend_histogram/make_trend_histograms.py" generates a set of "bins", one per 30-minute interval of the day; the weight of each daily price increase or decrease is measured and added to the bin for its time of day. A sample of the output image can be found in the root directory as "Trend_Plot.PNG". Areas with a large gap between the green and red lines are of potential interest, as they suggest that the price behaves more predictably in that timeframe.
- Mins and Maxes - Running "WIP/mins_and_maxes/find_mins_and_maxes.py" produces an image showing all local minima and maxima for the given set of data. A sample of this can be found in the root directory as "local_mins_and_maxes.PNG". Additionally, details of the global maximum and minimum are printed to the console.
- Machine Learning - This personal project started around the time I began a data mining course, purely out of personal interest. At that point I was admittedly a bit clueless and made a terrible attempt at using sklearn to predict future trends. I built a classifier that tried to classify the trend as an integer from -10 (strong decrease in price) to +10 (strong increase in price). At the time I was incredibly frustrated with my low accuracy and poor results... So that's a bit embarrassing. I left this in WIP/classifiers as a good example of what NOT to do for a project like this, haha. I have since learned a lot more about machine learning and plan to revisit this at a later date with a solution that is actually reasonable - definitely a regression one, or at the very least a classifier that doesn't have 21 classes.
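As an illustration of the "Collecting History Data" step, here is a minimal sketch of fetching one coin's recent market data with pycoingecko and saving it as a CSV file. It is not the code in "collectHistoryData.py": the function names `chart_to_rows` and `collect`, the column names, and the ISO timestamp format are assumptions made for this example. The granularity CoinGecko returns for a given `days` value is decided by the API, so it is not fixed here.

```python
import csv
from datetime import datetime, timezone


def chart_to_rows(chart):
    """Flatten a market-chart response into (time, price, market_cap, volume) rows.

    `chart` is the dict returned by the market_chart endpoint: three parallel
    lists of [timestamp_ms, value] pairs.
    """
    rows = []
    for (ts_ms, price), (_, cap), (_, vol) in zip(
        chart["prices"], chart["market_caps"], chart["total_volumes"]
    ):
        stamp = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).isoformat()
        rows.append((stamp, price, cap, vol))
    return rows


def collect(coin_id, out_path, vs_currency="usd", days=1):
    """Fetch recent market data for one coin and write it to out_path (hypothetical helper)."""
    # Imported here so chart_to_rows can be used without the third-party package.
    from pycoingecko import CoinGeckoAPI  # pip install pycoingecko

    chart = CoinGeckoAPI().get_coin_market_chart_by_id(
        id=coin_id, vs_currency=vs_currency, days=days
    )
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "price", "market_cap", "total_volume"])
        writer.writerows(chart_to_rows(chart))
```

Calling `collect("bitcoin", "bitcoin_24h.csv")` would write one file; looping over a list of coin ids gives the multi-coin behaviour described for "main.py".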
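The merge-point idea from "Appending CSV Data" can be sketched roughly as follows. This is an assumption about how such a merge could work, not a copy of "append_csv_data.py": it assumes both files share a header row, are sorted by a `time` column, and use timestamps that compare correctly as strings (for example ISO 8601). The function name `append_new_rows` is hypothetical.

```python
import csv


def append_new_rows(merged_path, new_path, key="time"):
    """Append rows from new_path that come after the last row of merged_path.

    Returns the number of rows appended. Rows at or before the merge point
    (the last key already in the merged file) are skipped, so overlapping
    collections never produce duplicates.
    """
    # Find the merge point: the key of the last row already merged.
    with open(merged_path, newline="") as f:
        reader = csv.DictReader(f)
        fields = reader.fieldnames
        last_key = None
        for row in reader:
            last_key = row[key]

    # Keep only rows that are strictly newer than the merge point.
    with open(new_path, newline="") as f:
        fresh = [
            row for row in csv.DictReader(f)
            if last_key is None or row[key] > last_key
        ]

    # Append in place instead of rewriting the whole merged file.
    with open(merged_path, "a", newline="") as f:
        csv.DictWriter(f, fieldnames=fields).writerows(fresh)
    return len(fresh)
```

Appending rather than rewriting keeps each merge cheap, and running the function twice on the same file is harmless because the second run finds nothing newer than the merge point.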
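For the "Trend Histogram" bins, one possible reading is sketched below. It assumes that the "weight" of a move is simply the absolute price change between consecutive samples, and that a move is assigned to the 30-minute bin of the later sample's time of day. The real script may weight things differently; `trend_bins` is a hypothetical name.

```python
from datetime import datetime

BIN_MINUTES = 30
BINS_PER_DAY = 24 * 60 // BIN_MINUTES  # 48 bins of 30 minutes each


def trend_bins(samples):
    """Accumulate price rises and falls per time-of-day bin.

    `samples` is a list of (datetime, price) pairs sorted by time.
    Returns (up, down): two lists of length BINS_PER_DAY holding the total
    rise and the total fall (both as positive numbers) seen in each bin.
    """
    up = [0.0] * BINS_PER_DAY
    down = [0.0] * BINS_PER_DAY
    for (_, prev_price), (stamp, price) in zip(samples, samples[1:]):
        change = price - prev_price
        index = (stamp.hour * 60 + stamp.minute) // BIN_MINUTES
        if change > 0:
            up[index] += change
        else:
            down[index] -= change  # store falls as positive magnitudes
    return up, down
```

Plotting `up` and `down` as two lines over the 48 bins gives a picture in the spirit of the green and red lines described above; because samples from many days land in the same bins, a persistent gap at one time of day stands out.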
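The "Mins and Maxes" step can be illustrated with a simple neighbour comparison. This is a minimal sketch under the assumption that a local extremum is a point strictly lower or higher than both of its immediate neighbours; "find_mins_and_maxes.py" may use a different definition or a library routine. The function names are hypothetical.

```python
def local_extrema(prices):
    """Return (minima, maxima) as lists of indices into `prices`.

    A point counts as a local minimum (maximum) when it is strictly
    lower (higher) than both neighbours; endpoints are never included.
    """
    minima, maxima = [], []
    for i in range(1, len(prices) - 1):
        if prices[i] < prices[i - 1] and prices[i] < prices[i + 1]:
            minima.append(i)
        elif prices[i] > prices[i - 1] and prices[i] > prices[i + 1]:
            maxima.append(i)
    return minima, maxima


def global_extrema(prices):
    """Return (index_of_global_min, index_of_global_max) for a non-empty list."""
    indices = range(len(prices))
    return min(indices, key=prices.__getitem__), max(indices, key=prices.__getitem__)
```

On minutely data this strict definition marks many tiny wiggles as extrema, so smoothing the series or comparing against a wider window first is a reasonable refinement.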