Open pdimens opened 5 years ago
Thinking about it some more, maybe something like a findall for the missing values to get an array of those indices, then omit the missing with skipmissing(array) |> collect, calculate the correction, and finally re-add missing into the output array at the original indices with insert!?
Here is how I handled the situation in my own code. I don't know if it would merit adding to your package:
# make a copy without the missing values
p_no_miss = skipmissing(P_array) |> collect
# get the indices of the original missing values (ascending order)
miss_idx = findall(ismissing, P_array)
# do the correction, widening the element type so the array can hold missing
correct = adjust(p_no_miss, correctionmethod) |> Array{Any,1}
# re-add missing at the original positions; this relies on miss_idx being ascending
for i in miss_idx
    insert!(correct, i, missing)
end
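The same idea can also be sketched without insert!, by writing the corrected values back through the indices of the non-missing entries. This is only an illustration under assumptions: adjust_skipmissing and bonferroni are hypothetical names that are not part of the package, and bonferroni is a stand-in correction so that the snippet runs without any dependencies.

```julia
# Hypothetical helper (not part of any package): apply `correct` to the
# non-missing entries of `p` and leave `missing` at its original positions.
function adjust_skipmissing(correct, p::AbstractVector)
    # start with an all-missing output of the same length as the input
    out = Vector{Union{Missing,Float64}}(missing, length(p))
    # indices of the entries that actually hold a p-value
    keep = findall(!ismissing, p)
    # correct only those entries and write them back in place
    out[keep] = correct(Float64.(p[keep]))
    return out
end

# Stand-in correction (Bonferroni) used purely for illustration.
bonferroni(p) = min.(p .* length(p), 1.0)

pvals = [0.01, missing, 0.04, 0.5, missing]
adjusted = adjust_skipmissing(bonferroni, pvals)
```

With the call used above, this would presumably be invoked as adjust_skipmissing(p -> adjust(p, correctionmethod), P_array). The missing entries never enter the calculation, so the number of tests seen by the correction is the count of non-missing p-values.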
Thanks for bringing up the handling of missing values. Your approach looks good to me; I'm not sure whether a more elegant way of removing and reinserting missing values exists. It is definitely worth exploring whether missing values would be better handled by the adjust methods themselves.
@pdimens Just to understand your case a bit better: How did you generate the original p-values and why are some values missing?
That's a pretty fair question. The p-values were generated with a chi-squared test. When it is performed on all the data, it works fine, but if the data is partitioned by group, some groups have a particular locus (genetics work) entirely missing, which I hadn't realized could happen.
The actual code is here: https://github.com/pdimens/PopGen.jl/blob/master/src/HardyWeinberg.jl if the specific implementation matters.
Okay, thanks for the details - that is interesting to see.
Having learned quite a bit since opening this issue, the PR submitted performs this a lot more elegantly than the code suggested above.
Is there a simple(ish?) method to perform the correction but skip missing values, and output the corrected array with missing respecting their original indices (but not used in the calculations)?
Reading that back to myself, it doesn't feel like it's worded too clearly, so maybe an example: