A DSCI 525 project by Group 3 to predict rainfall in NSW, Australia based on big datasets
Our goal is to develop and deploy a cloud-based ensemble machine learning model for future rainfall prediction in NSW, Australia. The datasets we used contain rainfall (mm/day) over time, either observed or computed by different models, and were retrieved from figshare. The datasets were loaded and combined using pandas and dask, then explored using both Python and R. The data will then be used to build and deploy a big data machine learning model that predicts future rainfall in Australia.
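As a rough illustration of the "load and combine" step described above, the following minimal sketch concatenates several per-model rainfall tables with pandas and tags each row with the model it came from. The function name `combine_rainfall_frames`, the column names `time` and `rain (mm/day)`, the model names, and the sample values are all assumptions made for this example, not the project's actual code or data.

```python
import io

import pandas as pd


def combine_rainfall_frames(sources):
    """Concatenate per-model rainfall tables, tagging rows with the model name.

    `sources` maps a (hypothetical) model name to anything that
    pandas.read_csv accepts, such as a file path or a file-like object.
    """
    frames = []
    for model_name, source in sources.items():
        df = pd.read_csv(source, parse_dates=["time"])
        df["model"] = model_name  # remember which model produced each row
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


# Tiny in-memory stand-ins for the real CSV files (values are made up).
demo_sources = {
    "model_a": io.StringIO("time,rain (mm/day)\n2000-01-01,1.5\n2000-01-02,0.0\n"),
    "model_b": io.StringIO("time,rain (mm/day)\n2000-01-01,2.5\n2000-01-02,0.5\n"),
}

combined = combine_rainfall_frames(demo_sources)

# Average rainfall across models for each day.
daily_mean = combined.groupby("time")["rain (mm/day)"].mean()
```

For files too large to fit in memory, the same idea could be expressed with dask, which reads the data lazily in partitions instead of loading it all at once.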
Python 3.8.3
re == 2.2.1
requests == 2.25.1
json == 2.0.9
pandas == 1.2.3
dask == 2021.3.1
rpy2 == 3.4.3
pyspark == 3.1.1
s3fs == 0.6.0
joblib == 1.0.1
matplotlib == 3.4.1
sklearn == 0.0
json5 == 0.9.5
urllib3 == 1.26.4
flask == 1.1.2
R 4.0.2
arrow == 3.0.0
dplyr == 1.0.3
We welcome and recognize all contributions. You can see a list of current contributors in the contributors tab. UBC MDS DSCI 525 Group III:
CMIP6 Experimental Design and Organization https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6
Pangeo Coupled Model Intercomparison Project Phase 6 https://pangeo-data.github.io/pangeo-cmip6-cloud/
SILO - Australian Climate Data https://www.longpaddock.qld.gov.au/silo/