calum-chamberlain / ESCI451-Python

Introduction to Python for VUW ESCI 451 course.
GNU General Public License v3.0

Singular covariance matrices result in error in seaborn kdeplot #24

Closed by calum-chamberlain 1 year ago

calum-chamberlain commented 2 years ago

In Module_3B, the final exercise has students select their own horizons. Selecting horizons G, N and R results in a LinAlgError due to a singular covariance matrix. It looks like this stems from the MnO column.

Strangely, using horizons A, N and R does not result in the error, but does produce warnings if warn_singular=True is set. Changing the order of the horizon filter to N, A, R does result in the error though...
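For context, here is a minimal sketch of one way a covariance matrix can end up singular: a variable with no spread at all. The arrays are invented for illustration and are not taken from the course data, and whether MnO behaves this way within a given horizon is an assumption rather than something confirmed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-variable sample: one variable varies, the other is constant
# (standing in for a column with no spread within a single horizon).
varying = rng.normal(size=50)
constant = np.zeros(50)
sample = np.vstack([varying, constant])  # shape (2, 50): variables in rows

# The constant variable contributes a zero row and column to the covariance.
cov = np.cov(sample)
rank = np.linalg.matrix_rank(cov)

# A rank-deficient (singular) matrix cannot be inverted.
try:
    np.linalg.inv(cov)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False

print(rank, inverted)
```

A 2D kernel density estimate needs a usable covariance matrix for each group it plots, so a group in which one variable barely varies can plausibly trigger this kind of failure.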

The code to reproduce this is:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

geochem = pd.read_csv("data/Swallow et al CMP glass data for plotting.csv")

# Filter to select horizons
horizon_filter = (geochem['Horizon'] == 'G') | (geochem['Horizon'] == 'N') | (geochem['Horizon'] == 'R')

# Parameter investigation
fig = sns.PairGrid(geochem[horizon_filter], hue='Horizon', diag_sharey=False)
fig.map_lower(sns.kdeplot, warn_singular=False, common_norm=False)
fig.map_diag(sns.kdeplot, lw=2, warn_singular=False)
fig.map_upper(sns.scatterplot)

plt.show()

calum-chamberlain commented 2 years ago

Adding the line:

geochem = geochem.drop(columns=["MnO"])

removes these issues.
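Rather than hard-coding the column name, a possible alternative is to look for columns with (near) zero spread within any selected horizon and drop those. The sketch below uses a small invented DataFrame named demo in place of the real geochem table; the values, the 1e-8 threshold and the variable names are all assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the geochem table; values are invented.
demo = pd.DataFrame({
    "Horizon": ["G", "G", "G", "N", "N", "N"],
    "SiO2": [76.1, 76.4, 75.9, 77.0, 77.3, 76.8],
    "MnO": [0.05, 0.05, 0.05, 0.04, 0.06, 0.05],
})

# Standard deviation of each numeric column within each horizon.
numeric_cols = list(demo.select_dtypes("number").columns)
per_group_std = demo.groupby("Horizon")[numeric_cols].std()

# Columns whose spread is effectively zero in at least one horizon
# (the 1e-8 threshold is an arbitrary assumption).
degenerate = per_group_std.columns[(per_group_std < 1e-8).any(axis=0)].tolist()

# Drop those columns before plotting.
cleaned = demo.drop(columns=degenerate)
print(degenerate)
```

This only catches columns that are constant within a horizon; columns that are strongly collinear with each other could still give a singular matrix.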

calum-chamberlain commented 1 year ago

Added a note to help students work around the error message.