AlexsLemonade / compendium-processing

A series of analyses related to refine.bio species compendia
BSD 3-Clause "New" or "Revised" License

Aggregate and mask data #5

Closed jaclyn-taroni closed 6 years ago

jaclyn-taroni commented 6 years ago

To test imputation strategies, we need true or gold standard values. Here, I've performed an inner join and then put together two strategies for masking values:

1) Missing completely at random - 30% of values are randomly selected and replaced with NA
2) What I'm calling "missing random rows", which I feel is slightly more realistic
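
For readers who want to see the first strategy concretely, here is a minimal sketch of missing-completely-at-random masking. It is written in Python for illustration and is not the code from this PR; the function name `mask_completely_at_random`, the toy 10 x 10 matrix, and the use of `None` in place of `NA` are all assumptions. The "missing random rows" strategy is not sketched because its details are not given here.

```python
import random


def mask_completely_at_random(matrix, fraction=0.3, seed=0):
    """Return a masked copy of `matrix` plus the set of masked positions.

    Hypothetical helper: `fraction` of all entries are chosen uniformly at
    random and replaced with None (standing in for NA). The input is not
    modified, so it can serve as the gold standard for evaluation.
    """
    rng = random.Random(seed)
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    # Enumerate every (row, column) position so each entry is equally
    # likely to be masked, regardless of its row or column.
    positions = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    n_masked = round(fraction * len(positions))
    masked_positions = set(rng.sample(positions, n_masked))
    masked = [
        [None if (i, j) in masked_positions else value
         for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]
    return masked, masked_positions


# Toy "gold standard" matrix (assumed data, 10 rows x 10 columns).
truth = [[float(i * 10 + j) for j in range(10)] for i in range(10)]
masked, masked_positions = mask_completely_at_random(truth, fraction=0.3, seed=42)
```

Keeping `truth` untouched and returning `masked_positions` means an imputation method can later be scored only on the entries that were deliberately hidden.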

jaclyn-taroni commented 6 years ago

@cansav09 I've made the changes to lapply you mentioned in https://github.com/AlexsLemonade/compendium-processing/pull/5/commits/422d6399643e37630c53386f6e9009ca3b6b2732. Let me know if you need me to clarify anything else in my responses!