ajitjohnson / imsig

Immune Cell Gene Signatures for Profiling the Microenvironment of Solid Tumours

Error in fastCor(t(exp)) : invalid nSplit: 0 #8

Closed: NiRuff closed this issue 4 years ago

NiRuff commented 5 years ago

Hi Ajit,

When I use your package with your example data, everything works fine.

When I apply it to a subset of our own data, however, I run into problems.

I removed rows with no variance using data <- data[apply(data, 1, var, na.rm = TRUE) != 0, ]

I also removed all GeneSymbols containing "-" to make sure that no symbols like Gnai3-201 remain in the dataset.

Applying imsig(data, r=0.0) gives:

---> Maximum number of splits: floor(n/2) = 0
---> WARNING: number of splits nSplit > 0
---> WARNING: using maximum number of splits: nSplit = 0
Error in fastCor(t(exp)) : invalid nSplit: 0

My data in the data.frame look like this:

;1;2;3;4
GDF15;2.252020e-01;2.139837e+02;6.835993e+00;4.944126e+01

I currently use a subset of about 200 genes with 30 columns. I also made sure that there are no duplicate GeneSymbols in the data.

Thanks in advance, Nicolas

ajitjohnson commented 5 years ago

Hi Nicolas,

Can you try with all genes? I am not sure how many ImSig genes are present among your chosen 200 genes. Can you also make sure there are no NA values in the data frame?
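Both checks can be done in a few lines of base R before calling the package. The snippet below is only a minimal sketch: `sig_genes` is a hypothetical placeholder vector, not the real ImSig gene list, and `expr` is a toy data frame with gene symbols as row names. Note that `floor(n/2) = 0` in the log can only hold when n is 0 or 1, which would fit the suspicion that almost no signature genes are left (assuming n counts the genes passed on to fastCor).

```r
# Hypothetical placeholder list; substitute the actual ImSig gene symbols.
sig_genes <- c("CD3E", "CD19", "CD14", "GDF15")

# Toy expression data: genes in rows (row names = gene symbols), samples in columns.
expr <- data.frame(
  `1` = c(0.23, 5.10, NA),
  `2` = c(213.98, 4.80, 2.20),
  row.names = c("GDF15", "ACTB", "CD3E"),
  check.names = FALSE
)

# How many signature genes are actually present in the data?
present   <- intersect(rownames(expr), sig_genes)
n_present <- length(present)

# How many NA values are there, and in which rows?
n_na         <- sum(is.na(expr))
rows_with_na <- rownames(expr)[rowSums(is.na(expr)) > 0]

cat("Signature genes found:", n_present, "\n")
cat("NA values:", n_na, "\n")
cat("Rows containing NA:", paste(rows_with_na, collapse = ", "), "\n")
```

If the count of signature genes found is very small, using a larger gene set (or all genes) is the first thing to try.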

Also, check this out: someone else had the same issue. https://github.com/ajitjohnson/imsig/issues/1

Best, Ajit.

NiRuff commented 5 years ago

Hi Ajit, thanks for your answer. I had already removed NA values, so that was not the problem. I increased the size of the subsample to 2000 genes but got the same error. When I increased it to 8000, it worked, so I guess I needed many more genes for enough ImSig genes to be included.

I subsampled because my PC crashed whenever I ran gene_stat on the whole dataset; the other functions I used worked well.

So thank you very much!

Best regards, Nicolas

NiRuff commented 5 years ago

Are the non-ImSig genes somehow considered in the calculations? If not, I could also do the subsampling based on the list of ImSig genes, keeping only those rows whose GeneSymbol matches an ImSig gene. This should speed up all calculations, right?

Best, Nicolas

ajitjohnson commented 4 years ago

Hi Nicolas, the non-ImSig genes are not considered during the calculation.
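Given that, the row filter Nicolas describes could look roughly like the sketch below. It is an illustration under stated assumptions: `sig_genes` is again a hypothetical placeholder for the ImSig gene list, and `expr` is a toy matrix with gene symbols as row names. Whether this actually speeds up every function in the package has not been verified here.

```r
# Hypothetical placeholder list; substitute the actual ImSig gene symbols.
sig_genes <- c("CD3E", "CD19", "CD14")

# Toy expression matrix: 4 genes x 2 samples, gene symbols as row names.
expr <- matrix(
  c(1.2, 3.4,
    0.5, 2.2,
    7.1, 6.3,
    0.1, 0.9),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("CD3E", "ACTB", "CD14", "GAPDH"), c("s1", "s2"))
)

# Keep only rows whose gene symbol is in the signature list.
# drop = FALSE keeps the matrix shape even if a single row remains.
keep     <- rownames(expr) %in% sig_genes
expr_sig <- expr[keep, , drop = FALSE]

print(dim(expr_sig))
print(rownames(expr_sig))
```

The filtered object keeps the original sample columns, so it can be passed on in place of the full data set.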