gtrebilcock / BitcoinEconometrics


Peer review - cw654 #6

Open cwang1113 opened 4 years ago

cwang1113 commented 4 years ago

Summary

This project aims to predict Bitcoin prices using features such as gold prices, inflation rates, and GPU pricing. The data will come from sources such as goldprice.org and pcpartpicker.com. The project is useful because Bitcoin is a very volatile instrument, and it would be really interesting to see whether its price is indeed "random."

Strengths

  1. The data looks like it can be structured in a tabular format, which will hopefully make it easy to run models such as linear regression. The features are also carefully chosen, and it makes intuitive sense why each should be considered.
  2. The citations of previous literature are very interesting, and I think they will be very helpful in determining which features to use. It might also help to build on some of the existing models from the literature.
  3. I think the explanation of why current models are not applicable is very important, as it raises the question of "why not?" This project will hopefully bring some insight into how the market for Bitcoin might differ from more traditional markets.

Improvements

  1. It might be a little complicated to gather data from different sources and combine them. Some sources may be missing data, which will lead to a lot of data cleaning. Furthermore, if you use many previous days' worth of data to predict the next day, a single gap in one source affects every row whose lag window includes it, so you may end up with a lot of missing data.
  2. Because there are so many features, it might help to run some sort of correlation test at the beginning to determine in advance which features will actually be useful. For example, I am not certain that NVIDIA stock data will be helpful.
  3. How do you plan on including time trends? For example, last year there was a huge surge in Bitcoin's price, largely driven by public perception, which eventually died down. Your model may not be able to detect a spike of that size, which may be something important to consider.
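
To make the three suggestions above concrete, here is a minimal pandas sketch. It is not the project's pipeline. All numbers are synthetic, and the column names (`btc`, `gold`, `btc_lag1`, `t`, `target`), the forward-fill choice, and the three-day lag window are assumptions made up for illustration. It shows how joining a seven-day series with a business-day series creates gaps, how lagged features cost rows, how a plain time index can act as a trend feature, and how a first-pass correlation screen might look.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: every number here is made up for illustration.
rng = np.random.default_rng(0)
days = pd.date_range("2018-01-01", periods=60, freq="D")

# Hypothetical Bitcoin series, quoted every calendar day.
btc = pd.DataFrame(
    {"btc": 10000 + rng.normal(0, 200, len(days)).cumsum()}, index=days
)

# Hypothetical gold series, quoted on weekdays only.
weekdays = days[days.dayofweek < 5]
gold = pd.DataFrame(
    {"gold": 1300 + rng.normal(0, 5, len(weekdays)).cumsum()}, index=weekdays
)

# Improvement 1: combine the sources on the date index and count the gaps.
data = btc.join(gold, how="left")
missing_before = int(data["gold"].isna().sum())

# One possible cleaning choice (an assumption): carry the last quote forward.
data["gold"] = data["gold"].ffill()

# Lagged features: each extra lag leaves one more incomplete row at the start.
n_lags = 3
for k in range(1, n_lags + 1):
    data[f"btc_lag{k}"] = data["btc"].shift(k)

# Improvement 3: a simple linear time index as a trend feature.
data["t"] = np.arange(len(data))

# Target: the next day's price, so the final row has no target.
data["target"] = data["btc"].shift(-1)

model_data = data.dropna()
rows_lost = len(data) - len(model_data)

# Improvement 2: rank features by absolute correlation with the target.
features = model_data.drop(columns="target")
corr = features.corrwith(model_data["target"]).abs().sort_values(ascending=False)

print(f"gold values missing after the join: {missing_before}")
print(f"rows lost to lags and the target shift: {rows_lost}")
print(corr)
```

One caveat on the screen: correlations between the levels of trending series tend to be inflated, so it may be safer to run the same check on daily changes or returns before dropping any feature.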