Put simply, MNAR (Missing Not at Random) is a type of missingness in which the probability that a value is missing depends, in whole or in part, on unobserved data. Missingness may also depend on observed data at the same time.
Implementation: Generate a random set of new features and base missingness on them. The number of features to generate may be some fraction of the number of existing features, or a random integer between 1 and n_features. These features could (should?) be a mix of continuous and categorical; the mix could follow the proportion of each feature type among the existing features. Once generated, impose missingness based on these new features.
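A minimal sketch of the idea above, to make the discussion concrete. Everything here is an assumption rather than existing impyute API: the function name `mnar_sketch`, its parameters (`latent_frac`, `categorical_frac`, `th`, `seed`), the three-level categorical features, and the choice to mask the highest-scoring entries in each column are all hypothetical.

```python
import numpy as np


def mnar_sketch(data, latent_frac=0.5, categorical_frac=0.5, th=0.2, seed=None):
    """Hypothetical MNAR corruption driven by generated, never-returned features.

    data: 2-D array-like. Returns a float copy with np.nan where values were dropped.
    """
    rng = np.random.default_rng(seed)
    data = np.array(data, dtype=float)  # copy, and NaN requires a float dtype
    n_rows, n_cols = data.shape

    # Number of hidden features: a fraction of the existing ones, at least 1.
    n_latent = max(1, int(round(latent_frac * n_cols)))
    # Assumption: the categorical share is passed in rather than inferred.
    n_cat = int(round(categorical_frac * n_latent))

    # Continuous hidden features (standard normal) and categorical ones
    # (three levels, coded -1, 0, 1 so they can enter a linear score).
    continuous = rng.standard_normal((n_rows, n_latent - n_cat))
    categorical = rng.integers(0, 3, size=(n_rows, n_cat)).astype(float) - 1.0
    latent = np.hstack([continuous, categorical])

    # Random weights tie every hidden feature to every observed column.
    weights = rng.standard_normal((n_latent, n_cols))
    scores = latent @ weights  # shape (n_rows, n_cols)

    # Drop the entries whose hidden score is in the top `th` share of its column.
    cutoffs = np.quantile(scores, 1.0 - th, axis=0)
    data[scores > cutoffs] = np.nan
    return data
```

Because the hidden features are discarded after the mask is built, the missingness cannot be explained by anything left in the returned matrix, which is what makes it MNAR. If every hidden feature is categorical, ties in the scores can push the dropped share away from `th`.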
Be sure that functions accept and return matrices.
Be sure to follow the 4 steps outlined in contributing.md.
The labels below are for DDFG (Data Days for Good) participant reference:
Priority: High
Difficulty: Medium
Complete mnar method in the Corruptor class.
https://github.com/eltonlaw/impyute/blob/2c25368576558374d385293f65c883a91dff5027/impyute/dataset/corrupt.py#L48-L50