There are quite a few variables in the XBT dataset that are not used in the iMeta algorithm or the first Leahy & Llopis ML paper. What is the relationship between these variables and the variables already included (country, profile date and maximum depth)? Some of this has already been explored for the relationship between depth and profile type, but not for other variables (as far as I know). Finding these relationships would help decide whether including more variables (such as platform, institute, cruise ID, quality flag, lat/lon etc.) is likely to improve classification results.
Some potentially useful statistics (correlation measures between categorical and continuous variables): https://medium.com/@outside2SDs/an-overview-of-correlation-measures-between-categorical-and-continuous-variables-4c7f85610365
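As a starting point, here is a minimal sketch of two association measures that could be applied to pairs of XBT variables: the correlation ratio (eta squared) for a categorical variable against a continuous one, and Cramér's V for two categorical variables. These two measures are my own suggestion and are not necessarily the ones covered in the link above. The variable names (`platform`, `country`, `max_depth`) and the toy values are placeholders invented for illustration, not real XBT records.

```python
from collections import Counter, defaultdict
from math import sqrt


def correlation_ratio(categories, values):
    """Eta squared: share of the variance in `values` explained by `categories`."""
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variance to explain
    # Between-group sum of squares, weighted by group size
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


def cramers_v(xs, ys):
    """Cramér's V (without bias correction) between two categorical variables."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    k = min(len(x_counts), len(y_counts))
    if k < 2:
        return 0.0  # a constant variable carries no association
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Toy values only (hypothetical, not taken from the XBT dataset)
platform = ["A", "A", "B", "B"]
country = ["X", "X", "Y", "Y"]
max_depth = [450.0, 460.0, 750.0, 760.0]

eta2_platform_depth = correlation_ratio(platform, max_depth)
v_platform_country = cramers_v(platform, country)
v_independent = cramers_v(["A", "A", "B", "B"], ["X", "Y", "X", "Y"])
```

Values near 1 would suggest that a candidate variable is largely redundant with one already in the model, while values near 0 would suggest it carries independent information. On the full dataset, library routines such as `scipy.stats.kruskal` or `scipy.stats.contingency.association` are probably a better choice than these hand-rolled versions.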