langcog / experimentology

Experimentology textbook
https://langcog.github.io/experimentology/

probability sampling #189

Closed mcfrank closed 1 year ago

mcfrank commented 1 year ago

@mayamathur we had:

Sampling strategies are split into two categories: **probability sampling** -- in which every member of the population has some chance of being selected -- and **non-probability sampling** -- in which there are some members of the population that simply cannot be selected.

You wrote: "MM: Have never heard the terms used this way. In my fields, 'probability sampling' is defined as various forms of marginally or conditionally random sampling. One can sample such that 'every member of the population has some chance of being selected', yet in such a way that the sample is not representative of any meaningful group at all."

I agree that ours is misleading (actually, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5325924/ makes this same mistake in their definition and has 700 citations!).

But I think it's right to say that probability sampling is sampling in which every member of the population has some known chance of being selected, right? It can be simple random or conditionally random. Are you OK with that edit?

mayamathur commented 1 year ago

Yes, how about this: "the probability that each member of the population is selected is known and decided a priori"? That would clarify that we're not talking about positivity, i.e., each person having probability > 0 of being sampled.

mcfrank commented 1 year ago

OK, that sounds good. I tried to use your definition but clarify in more detail:

Sampling strategies are split into two categories. Probability sampling strategies are those in which each member of the population has some known, pre-specified probability of being selected to be in the sample -- think, "generalizing to Japanese people by picking randomly from a list of everyone in Japan." Non-probability sampling covers strategies in which probabilities are unknown or shifting, or in which some members of the population could never be included in the sample -- think, "generalizing to Germans by sending a survey to a German email list and asking people to forward the email to their family."
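To make the "known, pre-specified probability" part concrete, here is a minimal Python sketch of two probability sampling designs: simple random sampling and stratified (conditionally random) sampling. The function names, the tiny urban/rural population, and the per-stratum sample sizes are all hypothetical and chosen only for illustration. The point is that in both designs each member's inclusion probability can be written down before any sampling happens, which is not possible for something like a forwarded email survey.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Draw n members without replacement; every member has probability n / N."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    # Inclusion probabilities are fixed by the design, not by the draw.
    inclusion_prob = {member: n / len(population) for member in population}
    return sample, inclusion_prob


def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a fixed number from each stratum (conditionally random sampling).

    A member of stratum h has inclusion probability n_h / N_h, which is
    known before sampling even though it differs across strata.
    """
    rng = random.Random(seed)
    sample, inclusion_prob = [], {}
    for name, members in strata.items():
        n_h = n_per_stratum[name]
        sample.extend(rng.sample(members, n_h))
        for member in members:
            inclusion_prob[member] = n_h / len(members)
    return sample, inclusion_prob


# Hypothetical population: 8 urban and 2 rural residents.
strata = {
    "urban": [f"u{i}" for i in range(8)],
    "rural": [f"r{i}" for i in range(2)],
}
population = strata["urban"] + strata["rural"]

srs, srs_probs = simple_random_sample(population, n=5)
strat, strat_probs = stratified_sample(strata, {"urban": 2, "rural": 1})
```

In the stratified design the probabilities are unequal (rural residents are oversampled here), but they are still known and decided a priori, so the design counts as probability sampling under the definition above.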

Feel free to reopen the issue if you have edits!