ankitaS11 / Crop-Yield-Prediction-in-India-using-ML

The model focuses on predicting the crop yield in advance by analyzing factors like district (assuming same weather and soil parameters in a particular district), state, season, crop type using various supervised machine learning techniques. This helps the farmers to know the crop yield in advance to plan and choose a crop that would give a better yield.
30 stars 19 forks

regarding co-relation of data #1

Open AZAD7337889873 opened 1 year ago

AZAD7337889873 commented 1 year ago

This message is regarding the correlation of the dataset that you have done. Can you help me understand it, as I am unable to follow it? I also need help understanding what you are ultimately trying to achieve through it.

sheefanaaz123 commented 1 year ago

Hello! The owner has done a bivariate analysis (the analysis of two variables at a time, for the purpose of determining the relationship between them). For this, the owner has used correlation (a statistical measure that expresses the extent to which two variables are linearly related, meaning they change together at a constant rate).
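To make "linearly related" concrete, here is a minimal sketch of the Pearson correlation coefficient written in plain Python. This is not the owner's code; the helper name `pearson` and the toy values are made up for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up toy values, not taken from the repository's dataset.
area = [1.0, 2.0, 3.0, 4.0, 5.0]
production = [2.1, 3.9, 6.2, 8.1, 9.8]

r_pos = pearson(area, production)                # close to +1: rises together
r_neg = pearson(area, [-p for p in production])  # close to -1: one rises, one falls

# y depends entirely on x here, but not linearly, so the coefficient is 0.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]
r_zero = pearson(x, y)

print(r_pos, r_neg, r_zero)
```

The last case shows that the coefficient only captures the linear part of a relationship.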

In the dataset, the owner used corr() (a function in pandas) to find how crop year, area, production, and yield relate to each other. corr() returns a matrix made up of correlation coefficients (a correlation coefficient is a statistical measure of the strength of a linear relationship between two variables; its value can range from -1 to 1). For example, between crop year and production the coefficient is 0.006989. It is positive, so production tends to go up slightly as the year goes up, but it is so close to 0 that there is almost no linear relationship between the two. Similarly, a negative coefficient shows that as the value of one variable goes up, the value of the other tends to go down. A value of 0 means the two variables have no linear relationship; they can still be related in a non-linear way.
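As a rough sketch of that step, this is how corr() can be called on a small DataFrame. The column names `Crop_Year`, `Area`, `Production`, and `Yield` and all the numbers are assumptions made up for this example; they are not the repository's data.

```python
import pandas as pd

# Made-up toy data; column names are assumed, not copied from the repository.
df = pd.DataFrame({
    "Crop_Year": [2000, 2001, 2002, 2003, 2004, 2005],
    "Area": [100.0, 150.0, 120.0, 200.0, 180.0, 160.0],
    "Production": [210.0, 290.0, 260.0, 380.0, 400.0, 310.0],
})
df["Yield"] = df["Production"] / df["Area"]

# DataFrame.corr() computes pairwise Pearson correlation by default.
corr_matrix = df.corr()
print(corr_matrix.round(3))

# A single pair can be read out of the matrix by row and column label.
year_vs_production = corr_matrix.loc["Crop_Year", "Production"]
print(year_vs_production)
```

The matrix is symmetric and has 1.0 on the diagonal, because every variable is perfectly correlated with itself.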

After that, the author plotted the correlation matrix to visualize it.
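One common way to draw such a matrix is a heatmap. The sketch below uses matplotlib and the same kind of made-up toy data; it is an assumption about how the plot could be produced, not necessarily how the owner did it, and the output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up toy data; column names are assumed, not copied from the repository.
df = pd.DataFrame({
    "Crop_Year": [2000, 2001, 2002, 2003, 2004, 2005],
    "Area": [100.0, 150.0, 120.0, 200.0, 180.0, 160.0],
    "Production": [210.0, 290.0, 260.0, 380.0, 400.0, 310.0],
})
df["Yield"] = df["Production"] / df["Area"]
corr_matrix = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
# Fix the colour scale to [-1, 1] so colours mean the same thing in every plot.
image = ax.imshow(corr_matrix.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)

labels = list(corr_matrix.columns)
ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)

# Write each coefficient inside its cell.
for i in range(len(labels)):
    for j in range(len(labels)):
        ax.text(j, i, f"{corr_matrix.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax, label="Pearson correlation")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")  # hypothetical output file name
```

Cells near +1 or -1 stand out in strong colours, which makes strongly related pairs easy to spot.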

Basically, it is one of the tools for our data analysis. It helps us understand the relationship between each pair of variables and gives us insights for further analysis of the data.

You can refer to the tutorial below:

https://www.datacamp.com/tutorial/tutorial-datails-on-correlation