git clone https://github.com/xingjian-bai/WiDS
conda env create -f environment.yml
conda activate wids
git add .
git commit -m "some message"
git push
2 character features: start date, climateregions_climateregion
8 variables contain missing data
Plots of the target against the start date
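A count like the one above can be reproduced with a short pandas check. This is a minimal sketch, not the notebook's own code: the helper name `missing_summary` and the toy columns are made up for illustration, and the real training frame would be passed in instead.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values for every column that has at least one."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


# Toy frame standing in for the training data (hypothetical columns).
toy = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": ["x", "y", None],
        "c": [1, 2, 3],
    }
)

summary = missing_summary(toy)
print(summary)
print(f"{len(summary)} variables contain missing data")
```

Calling `missing_summary` on the training frame lists the affected columns, which is usually the first step before choosing an imputation strategy.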
We do not have sufficient evidence to say that the target distribution is sampled from a normally distributed population.
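The notes do not say which normality test was run. As an illustration only, the sketch below assumes a Jarque-Bera test, implemented with NumPy, and applies it to a synthetic skewed sample rather than the real target. The function name and the sample are assumptions; the p-value uses the asymptotic chi-square distribution with 2 degrees of freedom.

```python
import numpy as np


def jarque_bera(x):
    """Jarque-Bera normality test: returns (statistic, asymptotic p-value)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)                 # biased variance
    skew = np.mean(centered ** 3) / m2 ** 1.5   # sample skewness
    kurt = np.mean(centered ** 4) / m2 ** 2     # sample kurtosis (normal = 3)
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = float(np.exp(-stat / 2.0))
    return float(stat), p_value


# Synthetic, clearly non-normal sample (stand-in for the target column).
rng = np.random.default_rng(0)
skewed = rng.exponential(size=2000)

stat, p_value = jarque_bera(skewed)
print(f"JB statistic = {stat:.1f}, p-value = {p_value:.3g}")
if p_value < 0.05:
    print("Normality is rejected at the 5% level.")
else:
    print("Normality is not rejected at the 5% level.")
```

A small p-value is evidence against normality; a large one only means the test failed to reject it, which is weaker than confirming a normal distribution.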
Target distribution by region
Perhaps we could build a separate model for each region and compare the results!
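The per-region idea can be sketched as follows. To keep the example self-contained it fits a plain least-squares line per region with NumPy as a stand-in for the real model; the column names (`region`, `x`, `target`) and both helper functions are hypothetical.

```python
import numpy as np
import pandas as pd


def design_matrix(frame, feature_cols):
    """Feature matrix with a leading intercept column."""
    features = frame[feature_cols].to_numpy(dtype=float)
    return np.column_stack([np.ones(len(frame)), features])


def fit_per_region(df, region_col, feature_cols, target_col):
    """Fit one least-squares model per region; returns {region: coefficients}."""
    models = {}
    for region, group in df.groupby(region_col):
        X = design_matrix(group, feature_cols)
        y = group[target_col].to_numpy(dtype=float)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        models[region] = coef
    return models


def predict_per_region(df, models, region_col, feature_cols):
    """Predict each row with the model of its own region (NaN for unseen regions)."""
    preds = pd.Series(np.nan, index=df.index, dtype=float)
    for region, group in df.groupby(region_col):
        if region in models:
            preds.loc[group.index] = design_matrix(group, feature_cols) @ models[region]
    return preds


# Toy data: region A follows y = 2x + 1, region B follows y = -x + 5.
toy = pd.DataFrame(
    {
        "region": ["A"] * 4 + ["B"] * 4,
        "x": [0.0, 1.0, 2.0, 3.0] * 2,
        "target": [1.0, 3.0, 5.0, 7.0, 5.0, 4.0, 3.0, 2.0],
    }
)

models = fit_per_region(toy, "region", ["x"], "target")
preds = predict_per_region(toy, models, "region", ["x"])
print(models)
```

Comparing the per-region error against a single global model on a validation split would show whether the split is worth the extra complexity.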
Target variable and relative humidity time series
Target variable and precipitation time series
Relative humidity and target variable relationship by climate region
I would explore this variable (Climate Region) in much more depth.
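One simple way to start that exploration is to compare the humidity/target correlation across regions. The sketch below is only an assumption about how this could be done; the column names (`region`, `humidity`, `target`) and the toy values are made up.

```python
import pandas as pd


def corr_by_region(df, region_col, x_col, y_col):
    """Pearson correlation between x_col and y_col, computed separately per region."""
    return pd.Series(
        {region: group[x_col].corr(group[y_col]) for region, group in df.groupby(region_col)}
    )


# Toy data: positive relationship in region A, negative in region B.
toy = pd.DataFrame(
    {
        "region": ["A", "A", "A", "B", "B", "B"],
        "humidity": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "target": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
    }
)

corr = corr_by_region(toy, "region", "humidity", "target")
print(corr)
```

Regions whose correlations differ in sign or strength are the ones where a region-specific feature or model is most likely to help.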
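import pandas as pd

# Fill the sample submission with predictions (model, X_test and target are defined earlier).
submit = pd.read_csv('/kaggle/input/widsdatathon2023/sample_solution.csv')
submit[target] = model.predict(X_test)
submit.to_csv('submission.csv', index=False)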
The current notebook is based on an LGBM model. Although a "location feature" was created in the original notebook from the latitude/longitude coordinates, the locations obtained differed between the training and test data.
In this notebook, we have corrected this problem.
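The notes do not describe how the correction was made. One possible approach, shown here purely as an assumption, is to round the coordinates and assign location ids on the concatenated train and test frames so that both share the same mapping. The column names `lat` and `lon`, the `decimals` value, and the helper name are hypothetical.

```python
import pandas as pd


def add_location_id(train, test, lat_col="lat", lon_col="lon", decimals=4):
    """Add a shared integer 'location' column to copies of train and test."""
    train = train.copy()
    test = test.copy()
    # Round first so tiny floating-point differences map to the same place.
    combined = pd.concat(
        [train[[lat_col, lon_col]], test[[lat_col, lon_col]]], ignore_index=True
    ).round(decimals)
    # One id per distinct (lat, lon) pair, numbered over train and test together.
    location = combined.groupby([lat_col, lon_col], sort=True).ngroup()
    train["location"] = location.iloc[: len(train)].to_numpy()
    test["location"] = location.iloc[len(train):].to_numpy()
    return train, test


# Toy frames: the test coordinates differ from train only by float noise.
train_toy = pd.DataFrame({"lat": [0.123456, 0.5], "lon": [0.1, 0.9]})
test_toy = pd.DataFrame({"lat": [0.1234561, 0.5], "lon": [0.1000001, 0.9]})

train_out, test_out = add_location_id(train_toy, test_toy)
print(train_out["location"].tolist(), test_out["location"].tolist())
```

Building the ids on the combined frame guarantees that a given coordinate pair gets the same id in both sets, which is the property the original location feature lacked.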