MetOffice / XBTs_classification

Project for the classification of eXpendable Bathy Thermographs
BSD 3-Clause "New" or "Revised" License

Calculate some stats for relationships between variables #16

Closed stevehadd closed 3 years ago

stevehadd commented 4 years ago

There are quite a few variables in the XBT dataset that are not used in the iMeta algorithm or the first Leahy & Llopis ML paper. What is the relationship between these variables and the already included variables of country, profile date and maximum depth? Some of this has already been explored for relationships between depths and profile type, but not for other variables (as far as I know). Finding these relationships would help decide whether including more variables (such as platform, institute, cruise ID, quality flag, lat/lon etc.) is likely to improve classification results.

Some potential stats: https://medium.com/@outside2SDs/an-overview-of-correlation-measures-between-categorical-and-continuous-variables-4c7f85610365
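
As a rough illustration of the kind of statistic meant here, the sketch below computes two commonly used association measures: Cramér's V for a pair of categorical variables, and the correlation ratio (eta) for a categorical variable against a continuous one. The function names, the variable names (`country`, `probe_type`, `max_depth`) and the toy values are all hypothetical and are not taken from the XBT dataset or this repository's code.

```python
from collections import Counter, defaultdict
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences (0 = none, 1 = perfect)."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(x_counts), len(y_counts)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined when a variable has a single category
    # Pearson chi-squared statistic from observed vs. expected cell counts
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def correlation_ratio(categories, values):
    """Correlation ratio (eta) between a categorical and a continuous variable."""
    groups = defaultdict(list)
    for c, v in zip(categories, values):
        groups[c].append(v)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variance to explain
    # Between-group sum of squares: how far each group mean sits from the grand mean
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total)


# Hypothetical toy data, for illustration only
country = ["US", "US", "JP", "JP"]
probe_type = ["T4", "T4", "T7", "T7"]
max_depth = [450.0, 450.0, 760.0, 760.0]

v = cramers_v(country, probe_type)
eta = correlation_ratio(probe_type, max_depth)
```

Values near 1 would suggest that a candidate variable is largely redundant with one already included, while values near 0 would suggest it carries independent information. Neither measure says anything directly about classification skill.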

stevehadd commented 3 years ago

This has been superseded by #103, which involves feature importance.