kaisarea closed this issue 2 years ago
Hi, thanks for the issue. The local segregation scores can't be negative, so you found a bug. The problem is that your variable is named "group", and the package doesn't deal well with that. If you use "race", for instance, the problem goes away:
library(tibble)
library(segregation)
options(scipen=5)
local_data = tribble(~SCHOOLID, ~race, ~count,
"100005_870", "WHITE", 669,
"100005_870", "BLACK", 12,
"100005_870", "HISP", 80,
"100005_870", "AIAN", 0,
"100005_870", "ASIAN", 2,
"100005_870", "PACIFIC", 16,
"100005_870", "TR", 25,
"100005_871", "WHITE", 703,
"100005_871", "BLACK", 12,
"100005_871", "HISP", 47,
"100005_871", "AIAN", 0,
"100005_871", "ASIAN", 2,
"100005_871", "PACIFIC", 0,
"100005_871", "TR", 27)
(mutual_local(local_data, "SCHOOLID", "race", weight = "count", wide = TRUE))
#> race ls p
#> 1: ASIAN 0.00003321619 0.002507837
#> 2: BLACK 0.00003321619 0.015047022
#> 3: HISP 0.03206493691 0.079623824
#> 4: PACIFIC 0.68502974604 0.010031348
#> 5: TR 0.00108653019 0.032601881
#> 6: WHITE 0.00054228911 0.860188088
Created on 2021-10-24 by the reprex package (v2.0.1)
I'll try to fix that issue soon.
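Until that is fixed, renaming the column before the call should be enough. A minimal sketch, assuming the input is a plain data frame with a column literally named "group"; the four rows are copied from the example above, and the base R rename is only one of several ways to do this:

```r
# Small input with a column literally named "group" (values taken from the thread)
local_data <- data.frame(
  SCHOOLID = c("100005_870", "100005_870", "100005_871", "100005_871"),
  group = c("WHITE", "BLACK", "WHITE", "BLACK"),
  count = c(669, 12, 703, 12)
)

# Rename "group" to "race" so the column name no longer trips up the package
names(local_data)[names(local_data) == "group"] <- "race"

# Same call as above; only runs if the segregation package is installed
if (requireNamespace("segregation", quietly = TRUE)) {
  print(segregation::mutual_local(local_data, "SCHOOLID", "race",
                                  weight = "count", wide = TRUE))
}
```

Any name other than "group" should work equally well; "race" is used here only to match the example.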
It's working now, thank you!
Hello, I have the same problem, but I can't resolve it by renaming the variables. In my case, the problem arises when I use the "se" argument and the function applies the bias correction. Here is the code:
library(tidyverse)
library(segregation)
base <- tribble(~ID_s, ~PRI, ~SEC, ~SUP,
1, 4, 4, 6,
2, 27, 34, 36,
3, 9, 15, 15,
4, 21, 33, 38,
5, 15, 23, 19,
6, 6, 8, 6,
7, 7, 14, 18,
8, 6, 8, 12,
9, 23, 34, 45,
10, 9, 16, 19
)
base |>
  pivot_longer(cols = PRI:SUP, names_to = "EDU",
               values_to = "n") |>
  mutual_local(group = "EDU", unit = "ID_s",
               weight = "n", se = TRUE,
               wide = TRUE) |>
  select(ID_s, p, ls)
And this is my output:
ID_s p ls
1: 1 0.02539623 -0.072997966
2: 2 0.18315094 -0.002555072
3: 3 0.07269811 -0.024815143
4: 4 0.17362264 -0.010504312
5: 5 0.10986792 -0.004451141
6: 6 0.03701887 -0.019953732
7: 7 0.07281132 -0.010572342
8: 8 0.04958491 -0.036701383
9: 9 0.19315094 -0.004720493
10: 10 0.08269811 -0.017325337
The problem disappears when I set "se = FALSE".
Thank you!
Hi, yes, that can happen when your sample is small. Basically, this means that your ls scores are most likely exactly zero. I could probably just set them to 0 manually when this occurs, but I think reporting the estimates as they are is more transparent. This is just something that can happen with the combination of bootstrap and bias correction when the parameters are close to 0. Maybe it would be good to have a FAQ entry about this, though.
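To see why this combination can dip below zero, here is a toy illustration. It is not the package's internal code: it assumes the textbook bootstrap bias correction (corrected = estimate - (mean of bootstrap estimates - estimate)) and uses a hypothetical helper, `mutual_info()`, that computes the plug-in mutual information of a small table in which units and groups are independent, so the true value is 0.

```r
# Toy illustration only; the helper and the simulated data are assumptions
set.seed(1)

# Plug-in mutual information (natural log) of a contingency table of counts
mutual_info <- function(tab) {
  p <- tab / sum(tab)
  expected <- outer(rowSums(p), colSums(p))
  nz <- p > 0                       # treat 0 * log(0) as 0
  sum(p[nz] * log(p[nz] / expected[nz]))
}

# 5 units x 3 groups, independent by construction, small sample
n <- 100
p_true <- outer(rep(1 / 5, 5), c(0.3, 0.3, 0.4))
tab <- matrix(rmultinom(1, n, as.vector(p_true)), nrow = 5)

estimate <- mutual_info(tab)        # plug-in estimate, never below 0

# Resample from the observed table and recompute the index each time
boot <- replicate(200, {
  resampled <- matrix(rmultinom(1, n, as.vector(tab / n)), nrow = 5)
  mutual_info(resampled)
})

bias <- mean(boot) - estimate
corrected <- estimate - bias        # can fall below 0 when the truth is near 0
c(estimate = estimate, bias = bias, corrected = corrected)
```

The plug-in estimate cannot be negative, so any sampling noise pushes it upward. The bootstrap replicates add a second layer of noise on top of that, and subtracting the estimated bias can overshoot past zero.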
Perfect. I did this manually but was not sure if it was correct. Thank you for your response and your work with this package!
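A minimal sketch of that manual step, assuming the result is stored in a data frame called `res` (a hypothetical name) with the `ls` values from the output posted above. Keeping the raw column next to the truncated one preserves the transparency mentioned in the previous comment:

```r
# ls values copied from the output above; `res` is a hypothetical name
res <- data.frame(
  ID_s = 1:10,
  ls = c(-0.072997966, -0.002555072, -0.024815143, -0.010504312,
         -0.004451141, -0.019953732, -0.010572342, -0.036701383,
         -0.004720493, -0.017325337)
)

# Keep the raw bias-corrected estimate and add a copy truncated at zero
res$ls_truncated <- pmax(res$ls, 0)
res
```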
Hello, I have the following 'dataset' called local_data (trying to create a reproducible example here):
Then I run:
mutual_local(local_data, "SCHOOLID", "group", weight = "count", wide = TRUE)
I get the following output:
My question is: how does one interpret negative values from the mutual_local() function? I actually even had all components being negative (I can try to create a reproducible example for that too if needed). What is the interpretation of zero, positive, and negative values here?