Closed Adeeshaj closed 11 months ago
Column metadata
2450 entries, 0 to 2456. Data columns (total 18 columns):

# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | id | 2450 non-null | int64 |
1 | listing_url | 2450 non-null | object |
2 | title | 2450 non-null | object |
3 | location | 2450 non-null | object |
4 | price | 2450 non-null | float64 |
5 | price_currency | 2450 non-null | object |
6 | listing_date | 2450 non-null | object |
7 | description | 2450 non-null | object |
8 | Brand: | 2450 non-null | object |
9 | Model: | 2450 non-null | object |
10 | Mileage: | 2450 non-null | object |
11 | Body type: | 2238 non-null | object |
12 | Condition: | 2450 non-null | object |
13 | Fuel type: | 2450 non-null | object |
14 | Transmission: | 2450 non-null | object |
15 | Engine capacity: | 2450 non-null | object |
16 | Year of Manufacture: | 2450 non-null | object |
17 | Trim / Edition: | 1970 non-null | object |
All fields except Body type and Trim / Edition are fully populated. Body type is about 91% populated (2238 of 2450) and Trim / Edition about 80% (1970 of 2450), so the data is complete enough to analyze.
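The completeness figures can be checked with a short pandas sketch. The small DataFrame here is a hypothetical stand-in, since the real listings data is not included in this issue; only the column names are taken from the table above.

```python
import pandas as pd

# Hypothetical stand-in for the listings data (values are made up).
df = pd.DataFrame({
    "price": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Body type:": ["Hatchback", None, "SUV", "Saloon", "Hatchback"],
    "Trim / Edition:": ["G Grade", None, None, "Sport", "X"],
})

# Share of non-null values per column, as a percentage.
completeness = df.notna().mean().mul(100).round(1)
print(completeness)
```

On the real data, the same expression should report 100 for every column except Body type and Trim / Edition.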
Price Analysis
Since the main goal is price analysis, this section looks only at the price column.
Histogram: To visualize the distribution of a single numerical variable, you can use a histogram.
Kernel Density Estimate (KDE) Plot: A KDE plot estimates the probability density function of a continuous variable.
Box Plot: A box plot summarizes the distribution of a numerical variable (median, quartiles, and whiskers) and can compare distributions across categories.
Violin Plot: Similar to a box plot, but also shows the probability density of the variable at different values.
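The four plot types above can be sketched with matplotlib as follows. This is an illustration only: the prices are synthetic (a log-normal sample standing in for the price column), the issue does not say which plotting library produced its charts, and `gaussian_kde_curve` is a hypothetical helper that computes the KDE by hand with Silverman's rule-of-thumb bandwidth.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, right-skewed prices standing in for df["price"] (not the real data).
rng = np.random.default_rng(0)
price = pd.Series(rng.lognormal(mean=15, sigma=0.6, size=500), name="price")


def gaussian_kde_curve(values, grid):
    """Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth."""
    values = np.asarray(values, dtype=float)
    n = values.size
    iqr = np.subtract(*np.percentile(values, [75, 25]))
    bandwidth = 0.9 * min(values.std(ddof=1), iqr / 1.34) * n ** (-1 / 5)
    z = (grid[:, None] - values[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


fig, axes = plt.subplots(2, 2, figsize=(10, 7))

axes[0, 0].hist(price.to_numpy(), bins=30)
axes[0, 0].set_title("Histogram")

grid = np.linspace(price.min(), price.max(), 200)
density = gaussian_kde_curve(price, grid)
axes[0, 1].plot(grid, density)
axes[0, 1].set_title("KDE")

axes[1, 0].boxplot(price.to_numpy())
axes[1, 0].set_title("Box plot")

axes[1, 1].violinplot(price.to_numpy(), showmedians=True)
axes[1, 1].set_title("Violin plot")

fig.tight_layout()
# Call fig.savefig("price_distribution.png") or plt.show() to view the result.
```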
Removing Outliers
The box plot and the violin plot clearly show potential outliers.
After removing the outliers, the charts are more meaningful.
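The issue does not state which rule was used to drop outliers. One common choice, shown here purely as an assumption, is the 1.5 × IQR rule that also defines the whiskers of a box plot. The prices below are made up for illustration.

```python
import pandas as pd

# Hypothetical prices; the last value is an obvious outlier.
df = pd.DataFrame({"price": [10, 12, 11, 13, 12, 14, 11, 200]})

# Interquartile range of the price column.
q1 = df["price"].quantile(0.25)
q3 = df["price"].quantile(0.75)
iqr = q3 - q1

# Keep only rows whose price lies within 1.5 * IQR of the quartiles.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
mask = df["price"].between(lower, upper)
df_clean = df[mask]

print(f"kept {mask.sum()} of {len(df)} rows")
```

A percentile cut-off (for example, dropping the top 1% of prices) is an alternative when the IQR rule removes too many legitimately expensive listings.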
Analysing categorical fields
Column | count | unique | top | freq |
---|---|---|---|---|
Brand | 2450 | 47 | Toyota | 673 |
Model | 2450 | 314 | Alto | 134 |
Mileage | 2450 | 729 | 100,000 km | 49 |
Body type | 2238 | 7 | Hatchback | 786 |
Condition | 2450 | 3 | Used | 2404 |
Fuel type | 2450 | 6 | Petrol | 1688 |
Transmission | 2450 | 3 | Automatic | 1519 |
Engine capacity | 2450 | 184 | 1,500 cc | 494 |
Year of Manufacture | 2450 | 59 | 2015 | 235 |
Trim / Edition | 1970 | 1253 | Toyota | 38 |
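Summaries of this kind (count, unique, top, freq) are what pandas `describe()` returns for non-numeric columns. A minimal sketch on made-up rows follows. Stripping the trailing colon from the column names is an optional tidy-up assumed here, not something the issue says was done.

```python
import pandas as pd

# Hypothetical rows; only the column names mirror the dataset.
df = pd.DataFrame({
    "Brand:": ["Toyota", "Suzuki", "Toyota", "Honda"],
    "Transmission:": ["Automatic", "Manual", "Automatic", "Automatic"],
    "price": [1.0, 2.0, 3.0, 4.0],
})

# Optional: drop the trailing colon that the scraped column names carry.
df.columns = df.columns.str.rstrip(":").str.strip()

# describe() on non-numeric columns yields count, unique, top and freq.
summary = df.select_dtypes(exclude="number").describe().T
print(summary)
```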
Analysing categorical fields - Grouping and Aggregation (mean - price)
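A minimal sketch of grouping by a categorical field and aggregating the mean price, assuming pandas and using made-up values. Fuel type is picked only as an example column. The count is included alongside the mean because a mean over a handful of listings is not very reliable.

```python
import pandas as pd

# Hypothetical listings; prices are made up.
df = pd.DataFrame({
    "Fuel type:": ["Petrol", "Diesel", "Petrol", "Hybrid", "Diesel"],
    "price": [100.0, 300.0, 200.0, 450.0, 500.0],
})

# Mean price and number of listings per category, most expensive first.
mean_price = (
    df.groupby("Fuel type:")["price"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(mean_price)
```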
Analysing categorical fields - Pie Chart
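A pie chart of a low-cardinality field can be sketched from `value_counts()`. The category labels below are placeholders, as the issue does not list the actual Transmission values beyond "Automatic", and matplotlib is an assumed choice of plotting library.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values; "Other" is a placeholder label.
transmission = pd.Series(
    ["Automatic", "Manual", "Automatic", "Other", "Automatic", "Manual"],
    name="Transmission:",
)

# Frequency of each category, largest first.
counts = transmission.value_counts()

fig, ax = plt.subplots()
ax.pie(counts.to_numpy(), labels=counts.index.tolist(), autopct="%1.1f%%")
ax.set_title("Transmission share")
# Call fig.savefig("transmission_pie.png") or plt.show() to view the result.
```

Pie charts suit fields with few categories, such as Condition or Transmission. For Brand or Model, with dozens to hundreds of unique values, a bar chart of the top categories is usually easier to read.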
Explore the dataset to understand its characteristics, such as data types, distributions, and potential outliers. Create summary statistics and visualizations to gain insights into the data.