jgx65 / hierfstat

the hierfstat package

basic.stats :Error in sHo/2/n : non-conformable arrays #22

Closed xinghuq closed 6 years ago

xinghuq commented 6 years ago

Hi, I used this function to calculate Ho, Hs, and Ht. It works with hierfstat data and genind data, but after I removed some populations from the hierfstat data it fails with: Error in sHo/2/n : non-conformable arrays. The data format was not changed. The data set initially had 16 populations, and after I removed 12 of them the function reports the error.

I still haven't figured out what's wrong.

Could you please help me check why?

Best!

jgx65 commented 6 years ago

Could you post a toy example that reproduces the error? Thanks.

xinghuq commented 6 years ago

I just figured out why this happened. Sorry, I forgot to update the levels of the new data frame. After reading the code, I see that the levels of the data can affect how the function runs.
I am trying to calculate hierarchical gene diversity (heterozygosity), so I manually removed populations, keeping only the populations of one region each time to get the total Ht of that region, and repeated this region by region. That seems clumsy. It would be better if the package could also report hierarchical heterozygosity (I assume it would be easy to derive from varcomp, right?). I am not good at programming, so if you could write the scripts, it would give us more statistical power in genetics. Thanks!

heatherturtle commented 5 years ago

Hello, I am having a similar problem, except that my data frame has no levels. It works for 66 observations and 110321 variables but doesn't work for 38 observations. The only thing I did was pull out two species (for pairwise comparisons) and then adjust the row names. The only difference between the two data frames is their dimensions. I am not sure how I can reproduce this for you, because the only difference is the number of observations (populations).

The error: Error in sHo/2/n : non-conformable arrays

The solution: after subsetting the data frame, either write it to a CSV file with quote = FALSE and then reload it, OR use type.convert, for example:

df[] <- lapply(df, type.convert)

The latter will take less time if you are working with a very large dataset.
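
A likely reason the type.convert round trip helps is that it rebuilds each column from its current values, so a factor column loses any levels that no longer occur. Below is a minimal base-R sketch of that effect. The data frame, its column names, and the genotype codes are made up for illustration, and as.is = FALSE is passed explicitly so that text columns come back as factors.

```r
# Toy data frame standing in for the real data (values are made up)
df <- data.frame(
  Pop   = factor(rep(c("a", "b", "c"), each = 2)),
  loc.1 = c(11, 12, 22, 12, 11, 22)
)

sub <- df[df$Pop != "c", ]   # drop population "c" by row subsetting
levels(sub$Pop)              # "c" is still listed as a level

# Re-infer every column's type; as.is = FALSE turns text back into factors
fixed <- sub
fixed[] <- lapply(fixed, type.convert, as.is = FALSE)
levels(fixed$Pop)            # only the populations still present
```

Note that type.convert re-infers the type of every column, so a population column whose labels look like numbers would come back as an integer column rather than a factor.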

jgx65 commented 5 years ago

@heatherturtle, the likely cause of the error is that your levels variable (i.e. pop) is a factor, and because you removed whole levels without re-creating the factor, the unused levels are still attached to it and trigger the error. Assuming your pop identifier is Pop in a data frame dat, just type dat$Pop<-factor(dat$Pop) before calling basic.stats. See the example below:

> library(hierfstat)
> dat<-sim.genot()  
> str(dat)
'data.frame':   150 obs. of  6 variables:
 $ Pop  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ loc.1: num  32 32 21 22 23 23 33 32 22 32 ...
 $ loc.2: num  33 12 33 34 32 21 13 42 33 33 ...
 $ loc.3: num  41 11 11 41 14 11 44 21 14 11 ...
 $ loc.4: num  42 22 22 24 22 32 22 22 33 22 ...
 $ loc.5: num  41 41 22 24 24 24 44 22 42 42 ...
> table(dat$Pop) 

 1  2  3 
50 50 50 
> dat$Pop<-factor(rep(letters[1:3],each=50))  #convert Pop to factor
> table(dat$Pop)

 a  b  c 
50 50 50 
> basic.stats(dat) #works
$`perloc`
          Ho     Hs     Ht    Dst    Htp   Dstp    Fst   Fstp     Fis   Dest
loc.1 0.4800 0.4844 0.5513 0.0669 0.5848 0.1003 0.1213 0.1716  0.0091 0.1946
loc.2 0.3667 0.3718 0.3935 0.0217 0.4043 0.0325 0.0551 0.0804  0.0139 0.0517
loc.3 0.6067 0.6097 0.6538 0.0441 0.6758 0.0661 0.0674 0.0979  0.0049 0.1694
loc.4 0.6733 0.6222 0.7197 0.0974 0.7684 0.1462 0.1354 0.1902 -0.0821 0.3869
loc.5 0.5133 0.5629 0.7124 0.1495 0.7872 0.2243 0.2099 0.2849  0.0881 0.5131

$overall
    Ho     Hs     Ht    Dst    Htp   Dstp    Fst   Fstp    Fis   Dest 
0.5280 0.5302 0.6061 0.0759 0.6441 0.1139 0.1253 0.1768 0.0042 0.2424 

> dat1<-dat[1:100,] # remove pop c data
> basic.stats(dat1)

Error in sHo/2/n : non-conformable arrays  # the error you are getting

> dat1$Pop<-factor(dat1$Pop) # refactor Pop
> basic.stats(dat1) # no more error message
$`perloc`
        Ho     Hs     Ht    Dst    Htp   Dstp    Fst   Fstp     Fis   Dest
loc.1 0.46 0.4935 0.5438 0.0503 0.5941 0.1006 0.0925 0.1694  0.0678 0.1987
loc.2 0.47 0.4488 0.4685 0.0197 0.4882 0.0394 0.0421 0.0808 -0.0473 0.0715
loc.3 0.57 0.5795 0.6092 0.0297 0.6389 0.0594 0.0488 0.0930  0.0164 0.1413
loc.4 0.67 0.6385 0.6744 0.0360 0.7104 0.0719 0.0533 0.1013 -0.0494 0.1990
loc.5 0.58 0.6434 0.7120 0.0686 0.7806 0.1372 0.0964 0.1758  0.0985 0.3848

$overall
    Ho     Hs     Ht    Dst    Htp   Dstp    Fst   Fstp    Fis   Dest 
0.5500 0.5607 0.6016 0.0409 0.6424 0.0817 0.0679 0.1272 0.0191 0.1860
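
To avoid forgetting the re-factoring step, the subsetting and the level clean-up can be wrapped together. The sketch below uses base R only. subset_pops is a hypothetical helper, not part of hierfstat, and the toy genotype codes are made up. It assumes the population identifier is in the first column, as in the example above.

```r
# Hypothetical helper (not part of hierfstat): keep only the requested
# populations and drop the factor levels that no longer occur.
subset_pops <- function(dat, keep, pop_col = 1) {
  out <- dat[dat[[pop_col]] %in% keep, , drop = FALSE]
  if (is.factor(out[[pop_col]])) {
    out[[pop_col]] <- droplevels(out[[pop_col]])
  }
  out
}

# Toy data with made-up genotype codes, mimicking the layout above
toy <- data.frame(
  Pop   = factor(rep(letters[1:3], each = 4)),
  loc.1 = c(11, 12, 22, 12, 11, 11, 12, 22, 22, 12, 11, 22)
)

naive <- toy[1:8, ]                     # plain row subsetting
clean <- subset_pops(toy, c("a", "b"))  # subsetting plus level clean-up

levels(naive$Pop)   # still lists "c"
table(naive$Pop)    # has a zero-count entry for "c"
levels(clean$Pop)   # only the populations actually present
```

The zero-count entry in table(naive$Pop) is presumably what leaves per-population tables with one more column than there are populations in the data, which would explain the non-conformable arrays message. The result of subset_pops can then be passed to basic.stats in the same way as dat1 above.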