For 48, 49 and 50, I would suggest adding `and body_mass_g is not null` to the `where` clause. Without it, penguins with no measurement (`body_mass_g is null`) are classified as large (48 and 49) and abnormal (50). Failing to specify how null values are handled (ideally by filtering them out of the analysis) is a common data-wrangling error that can easily lead to incorrect results. One example: 1 Adelie is classified as 'large' (>5000 g) even though no measured Adelie reaches even 4800 g.
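
To illustrate the point, here is a minimal sketch using Python's built-in `sqlite3` module. The queries in 48 to 50 are not reproduced here, so the table contents, the `case` expression and the 5000 g threshold below are assumptions made only for the demonstration. It shows how a row with a null `body_mass_g` falls through to the `else` branch and gets counted as 'large' unless it is filtered out.

```python
import sqlite3

# Hypothetical miniature penguins table; the values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (species text, body_mass_g integer)")
conn.executemany(
    "insert into penguins values (?, ?)",
    [("Adelie", 3750), ("Adelie", 4200), ("Adelie", None), ("Gentoo", 5400)],
)

# Assumed shape of the classification: anything not matching the first
# branch lands in 'else'. A comparison with null is never true, so null
# rows land there too.
QUERY = """
select species,
       case when body_mass_g <= 5000 then 'normal' else 'large' end as size,
       count(*) as n
from penguins
{where}
group by species, size
order by species, size
"""

unfiltered = conn.execute(QUERY.format(where="")).fetchall()
filtered = conn.execute(
    QUERY.format(where="where body_mass_g is not null")
).fetchall()

print(unfiltered)  # includes an Adelie counted as 'large'
print(filtered)    # the unmeasured Adelie is gone
```

With the filter in place, the unmeasured Adelie no longer shows up in the 'large' bucket. If the nulls should be reported rather than dropped, an explicit `when body_mass_g is null then 'unknown'` branch would be another option.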