Fix to #30
Changes in visualize.py:
New Visualization: Correlation Heatmap. The updated script adds a correlation heatmap (Visualization 3) that displays the correlation matrix of the numeric columns using the coolwarm color map. The original script has no such visualization.
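For reviewers, a heatmap of this kind could look roughly like the sketch below. The sample data and the column names (`iron`, `nickel`, `water_ice`, `celestial_body`) are made up for illustration and are not taken from the script; only `sns.heatmap` and the `coolwarm` color map come from the change described above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical sample data; the real dataset and column names may differ.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "iron": rng.uniform(0, 100, 50),
    "nickel": rng.uniform(0, 100, 50),
    "water_ice": rng.uniform(0, 100, 50),
    "celestial_body": rng.choice(["Moon", "Mars"], 50),
})

# Correlate only the numeric columns; the text column is excluded.
corr_matrix = data.select_dtypes(include="number").corr()

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
ax.set_title("Correlation Heatmap")
# In a Streamlit page the figure would then typically be shown with st.pyplot(fig).
```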
New Visualization: Boxplot. The updated script adds a boxplot (Visualization 4) in which users select a feature and compare its distribution against the final_score column. The original script has no such visualization.
New Visualization: Barplot for Aggregate Insights. The updated script adds a barplot (Visualization 5) showing the average value of a selected feature for each Celestial Body, with the x-axis labels rotated 45 degrees for readability. The original script does not include this barplot.
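The aggregate barplot can be sketched as a group-by mean followed by a bar chart. The column name `"Celestial Body"`, the `selected_feature` variable, and the sample values are assumptions for illustration; in the script the feature presumably comes from a user selection.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample data; the real column names may differ.
data = pd.DataFrame({
    "Celestial Body": ["Moon", "Moon", "Mars", "Mars", "Ceres"],
    "iron": [10.0, 20.0, 30.0, 50.0, 5.0],
})
selected_feature = "iron"  # assumed to be chosen by the user in the app

# Average of the selected feature for each celestial body.
avg_by_body = data.groupby("Celestial Body", as_index=False)[selected_feature].mean()

fig, ax = plt.subplots()
sns.barplot(data=avg_by_body, x="Celestial Body", y=selected_feature, ax=ax)
ax.tick_params(axis="x", labelrotation=45)  # rotate x-axis labels by 45 degrees
ax.set_title(f"Average {selected_feature} by Celestial Body")
```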
Check for final_score Column: The updated script checks whether the final_score column exists in the dataset and, if it is missing, computes it from predefined weights for various features (iron, nickel, water ice, etc.). The original script assumes the column exists and uses it directly.
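A minimal sketch of this fallback, assuming the score is a weighted sum of feature columns. The weights, the column names, and the helper `ensure_final_score` are hypothetical; the script's actual values are not listed in this description.

```python
import pandas as pd

# Hypothetical weights and column names, for illustration only.
WEIGHTS = {"iron": 0.4, "nickel": 0.3, "water_ice": 0.3}

def ensure_final_score(data: pd.DataFrame, weights: dict) -> pd.DataFrame:
    """Return a copy of `data` that is guaranteed to have a final_score column."""
    data = data.copy()
    if "final_score" not in data.columns:
        # Weighted sum across the feature columns.
        data["final_score"] = sum(data[col] * w for col, w in weights.items())
    return data

sample = pd.DataFrame({
    "iron": [10.0, 20.0],
    "nickel": [5.0, 0.0],
    "water_ice": [0.0, 10.0],
})
scored = ensure_final_score(sample, WEIGHTS)
print(scored)
```

An existing final_score column is left untouched, so precomputed scores in the dataset take precedence over the fallback.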
Display Available Columns: Right after loading the data, the updated script lists the dataset's columns with st.write("Available Columns:", data.columns). The original script has no such step.
Displaying Updated Columns: After calculating final_score, the updated script prints the updated column list with st.write("Updated Columns:", data.columns) to confirm that the final_score column was added. The original script has no such check.
Visualization 1 (Histogram): The updated script explicitly sets the histogram color to 'teal' in sns.histplot. The original script does not specify a color for the histogram.
Visualization 2 (Pairplot): The updated script adds a color palette (palette="coolwarm") to the pairplot. The original pairplot specifies no palette.
Custom Display of Top Sites: The updated script uses style.background_gradient to apply a color gradient to the adjusted_score column in the table of top sites. The original script displays the adjusted_score column without any styling.
Changes in pages/Visualize.py:
New Library Import: The updated script imports numpy as np, which it uses to build the mask for the correlation heatmap. The original script lacks this import.
New Visualization: Violin Plot. A violin plot (Visualization 3) has been added to compare the distribution of the selected feature across celestial bodies. The original script does not have this violin plot.
New Visualization: FacetGrid. A FacetGrid (Visualization 4) has been added to compare the distribution of the selected feature across celestial bodies. It uses sns.FacetGrid to draw a separate histogram for each celestial body. The original script does not include a FacetGrid.
Mask for Correlation Heatmap: The correlation heatmap in the updated script uses a mask (mask = np.triu(np.ones_like(corr_matrix, dtype=bool))) to hide the upper triangle of the heatmap, including the diagonal. Because the correlation matrix is symmetric, showing only the lower triangle displays each pair once and makes the heatmap easier to read. The original script does not use a mask.
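To show what the mask does, the sketch below applies the same `np.triu` expression to a small hand-written correlation matrix. The matrix values and feature names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 correlation matrix, for illustration only.
features = ["iron", "nickel", "water_ice"]
corr_matrix = pd.DataFrame(
    [[1.0, 0.8, -0.2],
     [0.8, 1.0, 0.1],
     [-0.2, 0.1, 1.0]],
    index=features,
    columns=features,
)

# True marks cells to hide: the upper triangle and the diagonal.
mask = np.triu(np.ones_like(corr_matrix, dtype=bool))

# Preview of the cells that stay visible (hidden cells become NaN).
visible = corr_matrix.mask(mask)
print(visible)

# With seaborn, the same mask would be passed as:
# sns.heatmap(corr_matrix, mask=mask, cmap="coolwarm", annot=True)
```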
Regression Plot: A regression plot (Visualization 6) has been added to visualize the relationship between iron and nickel. It uses sns.regplot to draw the data points together with a fitted trendline. The original script does not contain this regression plot.
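A regression plot along these lines might be sketched as follows. The synthetic data, the red trendline color, and the title text are assumptions; only the iron and nickel columns and `sns.regplot` come from the description above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data with a roughly linear iron/nickel relationship.
rng = np.random.default_rng(2)
iron = rng.uniform(0, 100, 60)
data = pd.DataFrame({
    "iron": iron,
    "nickel": 0.5 * iron + rng.normal(0, 5, 60),
})

fig, ax = plt.subplots()
# Scatter points plus a fitted linear trendline.
sns.regplot(data=data, x="iron", y="nickel", ax=ax, line_kws={"color": "red"})
ax.set_title("Iron vs. Nickel with Trendline")
ax.grid(True)
```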
Color Palette in FacetGrid: The FacetGrid draws the histograms of the selected feature in orange. Since the original script has no FacetGrid, this styling is new as well.
Consistency in Plot Titles: Plot titles in the updated version have been slightly reworded for clarity and consistency. For example, the violin plot and FacetGrid titles now state explicitly that the distribution is shown by celestial body.
Grid Customizations and Plot Titles: The updated version applies grid lines (plt.grid(True)) consistently across all visualizations and makes minor adjustments to title font size and weight for readability. The original script already had grid lines and title customizations; the update makes them uniform.
Changes to Pie Chart: The pie chart for celestial body distribution now uses the same title styling and aesthetic as the other visualizations (bold title font and grid). In the original version it was more basic.