fanli-gcb closed this issue 5 years ago
That is indeed the case. The problem is saving the sample names to the hdf5 file. In the current HDF5 version, the strings are restricted to 50 characters. I couldn't find a way around it.
There should be a warning in prepareMOFA:
if (any(nchar(sampleNames(object))>50)) warning("Due to string size limitations in the HDF5 format, sample names will be trimmed to less than 50 characters")
However, there is a simple solution. Just edit the sampleNames manually after loading the model:
sampleNames(object) <- sample_names
Make sure the order of sample_names matches the order of the samples in the model.
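A minimal sketch of how the order could be checked before overwriting the names, using base R only. The helper check_name_order and the example names are hypothetical and not part of MOFA. It assumes that each trimmed name stored in the model is a prefix of the original name at the same position.

```r
# Hypothetical helper (not part of MOFA): verify that every name read back
# from the model is a prefix of the original name at the same position.
check_name_order <- function(loaded, original) {
  stopifnot(length(loaded) == length(original))
  ok <- startsWith(original, loaded)
  if (!all(ok)) {
    stop("Name order mismatch at position(s): ",
         paste(which(!ok), collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example data: 60-character names, trimmed to 50 characters
sample_names <- paste0(strrep("x", 58), c("_A", "_B"))
loaded_names <- substr(sample_names, 1, 50)

check_name_order(loaded_names, sample_names)
# If the check passes, the assignment above should be safe:
# sampleNames(object) <- sample_names
```

Note that a prefix check cannot tell apart two names that share the same trimmed prefix, so it only catches ordering problems between names that differ within the kept characters.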
Thanks for the help! Here's the code I used for the workaround in case it's useful for anyone else (note that it operates on featureNames rather than sampleNames):
featurenames <- MOFA::featureNames(MOFAobject) # prior to runMOFA
MOFA::featureNames(MOFAobject) <- featurenames[names(MOFA::featureNames(MOFAobject))] # after runMOFA or loadModel
I noticed that somewhere in the runMOFA function, the feature names are getting truncated, which creates non-unique names that break downstream code.
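A quick way to check in advance whether truncation would produce duplicates is sketched below in base R. The helper find_truncation_collisions and the example names are hypothetical. The cut at exactly 50 characters is an assumption based on the limit mentioned in this thread; the real cutoff may be slightly shorter.

```r
# Hypothetical helper (not part of MOFA): return the original names that
# become identical once cut to `width` characters, grouped by trimmed name.
find_truncation_collisions <- function(x, width = 50L) {
  trimmed <- substr(x, 1, width)
  dup <- trimmed %in% trimmed[duplicated(trimmed)]
  split(x[dup], trimmed[dup])
}

# Made-up feature names: the first two differ only after character 50
feature_names <- c(
  paste0(strrep("g", 50), "_1"),
  paste0(strrep("g", 50), "_2"),
  "short_name"
)

collisions <- find_truncation_collisions(feature_names)
length(collisions)  # number of trimmed names shared by more than one feature
```

An empty result means no two names collide at that width, so the truncated names would still be unique.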
As an example, here are two of the original features:
After running runMOFA:
Any ideas on where this truncation is happening? I have narrowed it down to the runMOFA call, but I'm not sure where within that function.
Thanks in advance for any help!