Open jiwon-j opened 3 months ago
hi @jiwon-j, SOM is a multivariate model and you can build your input as a matrix where each row corresponds to a year and contains values from all the variables that you have.
These numpy functions can help you reshape your original data:
Thank you! I made a combined array, but do the two variables in it have to have the same shape? I'm trying to run a SOM and getting broadcast errors.
the input matrix needs to have only 2 dimensions, which means that you have to concatenate your data on the appropriate axis.
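As a rough illustration of concatenating on the appropriate axis, here is a minimal NumPy sketch. The array names, the random data, and the feature counts (500 and 300) are made up for the example; the point is only that the rows (years) must match, while the number of columns per variable can differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the real data: 20 years of flattened values.
gph = rng.normal(size=(20, 500))     # (year, flattened GPH values)
precip = rng.normal(size=(20, 300))  # (year, flattened precipitation values)

# Join along axis=1 so that each row holds every variable for one year.
# Only the number of rows (years) has to agree between the two arrays.
combined = np.concatenate([gph, precip], axis=1)

print(combined.ndim)   # 2
print(combined.shape)  # (20, 800)
```

Concatenating along axis=0 instead would stack the years of one variable under the other, and it fails when the column counts differ, which is one way to end up with shape errors.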
@jiwon-j I ran into a similar problem, and flattening (vectorizing) each sample of the input data into a 1D vector is how I got mine to work. Then set your input_len to the length of one sample. You can see the later parts of #187, where I show how I do this.
Unless you've found a way around it, I would imagine that all inputs need to be the same length (MiniSom expects a rectangular 2D matrix, so every sample must have the same number of values). Imputation may help with this?
Flattening adds quite a bit of dimensionality, but MiniSom is able to handle this sort of data.
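The flattening approach described above could look roughly like this. The grid sizes and variable names are assumptions chosen for illustration, not taken from #187, and the MiniSom calls are left as comments so that the snippet runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gridded fields: (year, lat, lon). Sizes are illustrative only.
n_years, n_lat, n_lon = 20, 10, 50
gph_grid = rng.normal(size=(n_years, n_lat, n_lon))
precip_grid = rng.normal(size=(n_years, n_lat, n_lon))

# Flatten each year's 2D field into a 1D vector: (year, lat * lon).
gph_flat = gph_grid.reshape(n_years, -1)
precip_flat = precip_grid.reshape(n_years, -1)

# One row per year, with both variables side by side.
data = np.hstack([gph_flat, precip_flat])

# input_len is the length of one sample (one row).
input_len = data.shape[1]
print(data.shape, input_len)  # (20, 1000) 1000

# With minisom installed, training would then look something like this
# (the 3x3 map size and iteration count are arbitrary choices):
# from minisom import MiniSom
# som = MiniSom(3, 3, input_len)
# som.train(data, 1000)
```

Because the two variables are in different units, it may also be worth standardizing each one before stacking so that neither dominates the distance calculation; that step is a suggestion, not something stated in this thread.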
I have two datasets, geopotential height (GPH) and Precipitation. Each variable has the same time dimension (year). I want to do clustering for each year, and I was wondering if and how I can run SOM considering "both" variables (not doing SOM for each variable individually).
I tried np.hstack, but this just joins the two variable arrays side by side, so I'm not sure if it's the right approach. (If GPH has shape (year: 20, flatten_values: 500) and Precipitation has shape (year: 20, flatten_values: 500), np.hstack turns them into (year: 20, flatten_values: 1000); it just attaches one to the other.) I was wondering if this is even possible.