On page 117, Krippendorff writes that "generalization is not a very important issue in content analysis." Why is that? How should we interpret the importance of external validity when conducting content analysis?
(1) What is the difference between cluster sampling and stratified sampling? As I understand it, cluster sampling is (for example) choosing all the passages in one journal. Why can't we take the ‘journal’ level as one stratum and regard it as stratified sampling? (2) Another question is about relevance sampling. How can we persuade others that our standards are really relevant? I mean, “alcoholism + student” is not necessarily better than “alcohol + student”. In research this may lead to many challenges. (3) Can we say convenience sampling is actually no sampling? (4) My last question is when we need sampling at all. The last part of this reading (sampling experiments) is a little confusing to me: if we have enough data to establish the benchmark and carry out the experiments, why not use the population directly rather than a sample? We have the whole population anyway, and the computational cost does not make a big difference in most social science studies.
Like @clk16, I also have trouble clarifying the relationship between stratified sampling and cluster sampling, both conceptually and methodologically. At a superficial level, one difference seems to lie in whether we draw a random subset of units from each stratum or randomly choose several whole clusters. I wonder if another possible difference could be that stratified sampling is based on known categories (e.g., gender, race, urban/rural region, etc.), while cluster sampling could work on unknown and unnamed categories. For instance, we could use unsupervised clustering methods to discover patterns in the data and do cluster sampling based on those results.
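To make the contrast concrete, here is a minimal Python sketch (the journal names and counts are made up): in both designs the journals define the groups, but stratified sampling draws units from every journal, while cluster sampling keeps every unit from a few randomly chosen journals.

```python
import random

# Toy corpus: each article carries the journal it appeared in.
articles = [{"journal": j, "id": f"{j}-{i}"}
            for j in ["JournalA", "JournalB", "JournalC"]
            for i in range(100)]

random.seed(42)

def stratified_sample(units, per_stratum=10):
    """Journals are strata: draw some units from EVERY journal."""
    strata = {}
    for u in units:
        strata.setdefault(u["journal"], []).append(u)
    return [u for stratum in strata.values()
            for u in random.sample(stratum, per_stratum)]

def cluster_sample(units, n_clusters=1):
    """Journals are clusters: pick a few WHOLE journals and keep all their units."""
    clusters = {}
    for u in units:
        clusters.setdefault(u["journal"], []).append(u)
    chosen = random.sample(list(clusters), n_clusters)
    return [u for j in chosen for u in clusters[j]]

print(len(stratified_sample(articles)))  # 30 articles, every journal represented
print(len(cluster_sample(articles)))     # 100 articles, all from a single journal
```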
To build off of @clk16 's second question: I do find it problematic that, in the example given for relevance sampling, the paper suggests that in some cases it is beneficial to find terms that yield fewer Google results. Would lowering the number of results not introduce bias, since the researcher would be encouraged to search using the most unusual synonyms for their research topic? Rarer words are more highly weighted in TF-IDF and other set-based IR methods.
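The mechanism behind that worry is easy to see with scikit-learn's TfidfVectorizer (the toy documents below are purely illustrative): a term that occurs in fewer documents receives a higher IDF weight.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "students drink alcohol at parties",
    "alcohol policy on campus",
    "a study of alcoholism among students",  # 'alcoholism' appears only here
    "campus life and student health",
]

vec = TfidfVectorizer()
vec.fit(docs)
idf = dict(zip(vec.get_feature_names_out(), vec.idf_))
# The rarer term gets the larger weight, so a query built around it will
# privilege a narrower slice of the collection.
print(idf["alcohol"], idf["alcoholism"])
```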
On a more minute note: while understanding sampling is definitely important for our projects, I wish there were more guidance on how to pick the size of one observation, e.g., when is it better to choose an entire newspaper as the unit versus an article from a newspaper? As someone generally accustomed to working with numbers, I am having a hard time conceptualizing the best size for each observation in my project (should I break my data up by sentence? paragraph? conversation?).
Hi, I was thinking that the described sampling techniques apply to text that was meant to be presented in written form (even speeches are planned). What about, let's say, text derived from people talking? Unstructured, spontaneous creation of content. In this scenario, given the orality of the context, it is likely that many words and ideas will be repeated far more often. How could I sample in this case?
I'm interested in the snowball sampling technique and the underlying effect of intertextuality. As a systematic way to find text that is related to the initial set of texts until a natural boundary is reached, snowball sampling can uncover texts that a researcher may not have originally thought to include.
However, are there biases implicit in the intertextuality of the documents? Does snowball sampling obscure from the researcher the biases affecting the associations between texts? In the Convenience Sampling section, the author mentions that researchers may have to compensate for biases by taking into account how they obtained the text. Since snowball sampling is implemented without much human intervention, how can biased associations in the texts be accounted for?
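As a concrete illustration of how the seed set constrains what snowball sampling can ever reach, here is a minimal sketch over a hypothetical citation graph (the paper names and edges are invented): every text the procedure can sample lies in the neighborhood reachable from the seeds, which is exactly where such biases would hide.

```python
from collections import deque

# Hypothetical citation graph: paper -> papers it cites.
citations = {
    "seed_paper": ["paper_a", "paper_b"],
    "paper_a": ["paper_c"],
    "paper_b": ["paper_c", "paper_d"],
    "paper_c": [],
    "paper_d": ["paper_e"],
    "paper_e": [],
}

def snowball(seeds, graph, max_rounds=None):
    """Breadth-first expansion from the seed texts until no new references appear."""
    sampled, frontier, rounds = set(seeds), deque(seeds), 0
    while frontier and (max_rounds is None or rounds < max_rounds):
        next_frontier = deque()
        for paper in frontier:
            for cited in graph.get(paper, []):
                if cited not in sampled:
                    sampled.add(cited)
                    next_frontier.append(cited)
        frontier, rounds = next_frontier, rounds + 1
    return sampled

print(snowball(["seed_paper"], citations))  # only texts reachable from the seed
```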
I have a question about systematic sampling on page 115. According to the definition, systematic sampling selects every kth unit from a list after determining the starting point at random, and the interval k is a constant. I am curious how to select a proper k to avoid creating biased samples. How do we avoid the interval coinciding with a natural "rhythm" or other unknown regularity? The article demonstrates several examples, but they are all based on a known "rhythm," such as the New York Times science section being published every Tuesday. What if we do not have prior knowledge about the cyclic regularities, and how could we identify them?
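A minimal sketch of the pitfall, assuming the units are ordered chronologically (toy numbers): when the interval equals the cycle length, every draw lands on the same phase of the cycle, while an interval coprime with the cycle rotates through it. When the rhythm is unknown, one pragmatic check is to draw samples with several different intervals (or inspect the autocorrelation of whatever is being measured) and see whether the results shift.

```python
import random

def systematic_sample(units, k, seed=None):
    """Every k-th unit after a random starting point within the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return units[start::k]

issues = list(range(60))  # 60 consecutive daily issues, day 0 = a Monday (hypothetical)
print(systematic_sample(issues, 7, seed=1))  # always the same weekday (same value mod 7)
print(systematic_sample(issues, 9, seed=1))  # rotates through the weekly cycle
```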
Krippendorff’s chapter on sampling addresses several criteria for improving the explanatory power of data. However, one important aspect seems to be missing: how can we question or break up the categories into which a specific institution (a state, a company) has sorted the data? For example, until 1980, the U.S. Bureau of the Census classified exclusively the adult male as ‘head of the household’ when he and his wife were living together. Categorizations can carry and reproduce a certain worldview that can become more and more distant from social reality. In order to better understand social action, how can we question existing categories when they seem to be baked into the data?
I also have a question regarding relevance sampling. I think this approach can come in very handy when researching patterns such as changing trends in certain words and phrases in literature and journalism over time, but there can be many problems in the details of applying it. Synonyms of certain phrases may be used in contexts the researcher is not aware of, so content that could be crucial information is not sampled. Furthermore, there might be irony in the sampled content that the researcher might not want to include. Are there common approaches to dealing with these nuances of relevance sampling?
My question is on snowball sampling. I was wondering if selection bias by sampling through the network of text is a concern in content analysis, especially since text in the network is likely to be more similar than text outside of the network? I am also wondering if it is possible to do a hybrid of sampling methods e.g. snowballing over different clusters identified and whether there are limitations/red flags we should be aware of in mixing the sampling methods?
I have a question with regard to relevance sampling and varying probability sampling. Krippendorff states that relevance sampling is not probabilistic sampling, in the sense that it simply excludes textual units that do not possess relevant information. But couldn't we consider the process similar to varying probability sampling, assigning zero probability to units that would not contribute to the research questions?
The reading on sampling demonstrates that snowball sampling can be used for scholarly articles. A researcher can start with the literature on a certain subject and expand the sample by following its cited works. However, it is hardly the case that all the cited works are relevant to the subject under discussion. Scholars are expected to cite all the works they draw on, whether major or minor to their research, and simply reading through the references will not reveal which works are more important and relevant to the main subject. Is there any systematic way to distinguish meaningful references from minor ones?
I'm wondering how the theory we're trying to test or the question that we ask could inform the sampling decisions that we make? The author mentions a study by Wells and King (1994) that examines the coverage of foreign affairs during a presidential campaign with sources from four newspapers that are ideologically closer to the Democratic Party. This seems problematic. On the other hand, if the question is whether mainstream US newspapers display a conservative bias on foreign policy relative to the Democratic candidates, using these outlets may suffice because if we can show these outlets are conservative even when there's a downward bias, then other newspapers are probably also relatively conservative. But our prior about the ideological leanings of these outlets may not be accurate. To what extent should we let our prior knowledge and assumptions guide our sampling decisions?
The author mentions that "Snowball sampling ends when it reaches natural boundaries, such as the complete literature on a subject". However, in reality researchers can hardly search and review all the relevant literature on a subject, especially if the subject encompasses thousands of studies. I'm wondering how we should define the "natural boundary" in this case.
Also, several other comments mentioned that using the Google search engine to find documents can be biased. The choice of keywords could largely affect the results we found. How should we address this issue during sampling?
I found the explication on varying probability sampling particularly interesting and relevant to popular concerns on contemporary mass media coverage. Would it be possible to work through additional ways researchers go about giving oft-silenced voices greater weight than mainstream ideas? For instance, I would pose assessing Oscar winners (as nominated by the Academy) versus the rankings of movies on IMDb as an example of two different routes of weighting. However, I am interested in additional examples of weighting normally silenced views and opinions.
I am wondering whether there are studies that use only relevance sampling, and how they can produce an objective, unbiased sample. The author classifies relevance sampling as one of the sampling plans, but to some degree it seems more like a step in selecting the sample than a full plan that can be applied to conduct a study. In addition, although the author notes that the method is not probabilistic, his description and examples seem to imply that researchers apply it based on subjective perceptions of the relevant issues so that the results will support their hypotheses, rendering the results less credible. Is this a problem when analysts apply relevance sampling? If not, how do they avoid these pitfalls?
(1) If someone wanted to identify the text that generated an idea -- for example, wanting to find the origin of a certain slang phrase in text data -- would snowball sampling, without an explicit stopping condition, be the appropriate technique to use? In other words, assuming we had a comprehensive enough data set*, could we reasonably infer that, when the process generates no new references, the last one is the origin point for the phrase (or at least brings us close to it)? If not, how is text analysis used to track language evolution**?
*ties into @bjcliang-uchi and @heathercchen 's questions about sample size and external validity
**In the reading, Krippendorff states that snowball sampling on content analysis literature terminated with a piece published in the 1600s, but sampling on the term "content analysis" stopped at a publication from 1941. If the 1600s piece actually did discuss content analysis but referred to it by another (semantically equal) term, how would we account for references like this that slip through the cracks?
One thing that makes sampling of text data really complex is the hierarchical, interconnected nature of text. As Krippendorff mentions in 6.1, units sampled are not units counted -- different levels of text/recording units may exist in the data. However, in the following section on sampling methods (6.2), it seems that most of the nine methods are aimed at a single level of text unit (e.g., news articles). Cluster/stratified sampling may address some hierarchical problems (though I suspect it can only deal with simple, non-overlapping categories). Do we have a better approach to this multilevel/multiunit problem? For example, sampling on one level while keeping the research generalizable on another level?
The two populations the author mentions in the text are very interesting: the population of ‘answers’ to a research question and the population of ‘texts’ that contain the answers to the question. There are also two different units in text analysis — ‘sampling units’ and ‘recording units’ — which means that the texts we sample are not necessarily the units we will code, record, and analyze. That is why relevance sampling and cluster sampling are more prevalent in text analysis: they take meaning and prior knowledge about texts into account.
Therefore, I was wondering: are probability sampling methods still important in text analysis, given that they do not operate on the actual units we are analyzing? And are there any gold standards for selecting ‘relevant’ samples, which is a subjective process? For example, when analyzing general language there are so many corpora we can use (books, conversations, articles, etc.); which should we select, and how diverse should the selection be for better representation?
On page 120 of the article, in the final paragraph of the section on "Relevance Sampling", Krippendorff claims that the characteristic problems of this particular method have become more significant given "the increasing use of very large electronic text databases and the Internet" due to the inherent difficulty of parsing such massive amounts of data for relevance. How can computational methods of textual analysis be used to facilitate the otherwise manual processes involved in relevance sampling for such massive digital data sources? How might a researcher judge whether it would be more efficient to develop a computational relevance filter for their sampling or to perform this process manually?
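One cheap starting point is a keyword/regex relevance filter run over the whole collection before any closer reading, with its hit rate checked against a small manually coded sample to judge whether refining the filter beats coding by hand. The patterns below are hypothetical and only illustrate the idea, not a validated query.

```python
import re

# Hypothetical filter for the 'alcohol among students' example: several synonyms
# per concept, so relevance does not hinge on a single literal keyword.
ALCOHOL = re.compile(r"\balcohol(ism|ic)?s?\b|\bbinge[- ]drink\w*\b", re.IGNORECASE)
STUDENT = re.compile(r"\bstudents?\b|\bundergrad\w*\b", re.IGNORECASE)

def is_relevant(doc: str) -> bool:
    """Keep a document only if it mentions both concepts somewhere."""
    return bool(ALCOHOL.search(doc) and STUDENT.search(doc))

docs = [
    "Binge drinking among undergraduates rose last year.",
    "Campus parking fees will increase next fall.",
]
print([is_relevant(d) for d in docs])  # [True, False]
```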
These all seem to be fairly standard methods of sampling, modified in interesting ways to account for the four assumptions outlined on pp. 112-113. What Krippendorff doesn't really explain, and what I'm interested in, are tests of validation for content analysis. Are there tests of significance, or something of the like, in content analysis? I can imagine that once you've decided on the populations, units, and relevances (i.e., you've assembled a corpus) and have reduced it to some numerical form according to your research question/preferred methods, you can derive p-values and the like. If that's the case, are frequentist or Bayesian tests more common?
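Krippendorff doesn't prescribe a test in this chapter, but once coded content has been reduced to counts, the ordinary frequentist machinery applies. A minimal sketch with an invented coding table, using a chi-square test of independence between two sources:

```python
from scipy.stats import chi2_contingency

# Hypothetical counts after coding: rows = newspapers, columns = articles coded
# as 'critical' vs. 'not critical' of some policy.
table = [[45, 155],   # Newspaper A
         [70, 130]]   # Newspaper B

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # small p -> coverage differs by paper
```

A Bayesian analogue would put priors on the two proportions and compare their posteriors; either way, the inference is only as good as the sampling design and coding reliability behind the counts.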
In the context of content analysis (especially with huge amounts of text data), how do we sample meaningfully when we do not know the texts well enough, given their enormous volume?
What are the differences and similarities between relevance sampling, snowball sampling, and convenience sampling? The article says relevance sampling is purposeful and aims to answer very specific questions, but I assume snowball or convenience sampling can also be designed to answer specific questions. Relevance sampling sounds like searching an academic database such as Sociological Abstracts, while snowball and convenience sampling sound like clicking through Wikipedia.
In the cluster sampling section, the reading mentions that a sample can be representative of the population of clusters (e.g., newspapers) without being representative of the population of units (e.g., articles). In a cluster sample where this happens, is there a way to "correct" the sample for possible unrepresentativeness of the units? I can imagine that some kind of unit weight is needed in this case, though I'm not sure how these weights should be constructed. Thanks!
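One standard correction is inverse-probability (Horvitz-Thompson-style) weighting: each sampled unit is weighted by the inverse of its probability of ending up in the sample. A minimal sketch for a hypothetical two-stage design (sample a few newspapers, then a fixed number of articles per paper), where articles in large papers would otherwise be underrepresented:

```python
import random

random.seed(0)

# Hypothetical clusters: 10 newspapers of very different sizes, each article
# carrying some numeric score whose article-level mean we want to estimate.
papers = {f"paper_{i}": [random.gauss(i, 1) for _ in range(20 + 10 * i)]
          for i in range(10)}

M, m, n = len(papers), 4, 5          # 10 papers total, sample 4, then 5 articles each
chosen = random.sample(list(papers), m)

weighted_sum = weight_total = 0.0
for name in chosen:
    articles = papers[name]
    drawn = random.sample(articles, n)
    # Inclusion probability = P(paper chosen) * P(article chosen within that paper);
    # the weight is its inverse, so articles from big papers count for more.
    w = 1.0 / ((m / M) * (n / len(articles)))
    for score in drawn:
        weighted_sum += w * score
        weight_total += w

print("weighted estimate of the article-level mean:", weighted_sum / weight_total)
```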
The chapter provides a good survey of existing sampling techniques. My question is whether there is a good rule of thumb for when sampling should be used -- for example, how large the data must be before sampling is needed. Is it merely a limitation on computational resources that bars us from using the entire corpus (given no resource restriction), or is there a theoretical justification that, beyond a certain dataset size, sampling is always preferred?
While I understand that snowball sampling can be applied to many different types of connections or assumed networks, if we wanted to apply it in some way such that we traversed multiple trees or "snowball sessions," would we need to employ another type of sampling so that our root nodes were distributed in some way relevant to our eventual interests? For example, in scientific literature, similar to @chun-hu 's question, if we traverse some tree of citations, can we just assume it spans the entire network and that we're not stuck in a local community?
My questions from this article:
How can we think about the connection between research question and sampling when using content analysis? What ideas can help us to define the unit of analysis and hence to define a universe to sample from?
My question is: could you please offer some occasions when sampling is needed for content analysis? If we regard an article as the features of a unit, it seems there is no need to sample; instead we just need to find a way to measure the whole article. And although we may need to sample units whose properties in the whole population are somewhat unclear to us, this seems similar to ordinary sampling. Sorry about my confusion, and thank you!
I am interested in varying probability sampling. I have seen this method in other social science research, but when it comes to content analysis, the weight we assign to each source can be trickier. For example, if we are surveying public opinion on climate change and we want to look at people's discussions on Facebook, Reddit, etc., do we just compare the number of users on each of these platforms and derive a weight from that ratio?
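A minimal sketch of one way that could look, with made-up platform shares: weight each post so that the platforms' shares of the sample match the target shares (e.g., each platform's share of active users) rather than how many posts we happened to scrape from each platform.

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical corpus: far more Facebook posts were collected than Reddit posts.
posts = ([("facebook", f"fb_{i}") for i in range(1000)] +
         [("reddit", f"rd_{i}") for i in range(200)])

# Target shares (invented), e.g. proportional to each platform's active users.
platform_share = {"facebook": 0.7, "reddit": 0.3}
posts_per_platform = Counter(platform for platform, _ in posts)

# A post's selection weight spreads its platform's share over that platform's posts,
# so the sample reflects the target shares, not the accidents of data collection.
weights = [platform_share[p] / posts_per_platform[p] for p, _ in posts]
sample = random.choices(posts, weights=weights, k=1000)  # with replacement

print(Counter(p for p, _ in sample))  # roughly 700 facebook / 300 reddit
```

Whether user counts are the right target shares is, of course, exactly the substantive question; the code only shows how to apply whatever weights one can defend.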
For sampling in the content analysis context, I am wondering what would count as a valid sample if the texts come from heterogeneous sources. For example, if a study wants to analyze data from both Twitter and journal articles, should we construct two different samples, or is there any way to construct one corpus that includes texts from both sources?
I'm curious about how sampling would work if we were to build a panel of textual observations. Building off of the idea of intertextuality, how do we pick samples in t = 2 such that the networked predecessors in t = 1 are accurately represented?
I am concerned that snowball sampling, especially when applied in online media research, might lead to a sample size that is too large to manage. Is it usual in the content analysis field to stop it at some point (say, do not include posts before 2018) before it gets unmanageable?
Like @adarshmathew I am also interested in intertextuality and snowballing. The author lists a few interesting examples of intertextuality, such as networks of literary relationships (pg. 118), but when I think about intertextuality as a literary concept, relationships and references are not always so well defined. Intertextuality, such as the reference to William Blake in Walcott's poem Ruins of a Great House, can be more like a nod to other works and concepts in the canon. Therefore, constructing a sample here might require more context, such as knowledge of Walcott's influences or themes of post-colonialist reckoning.*
The author states that:
> Snowball sampling **starts** with an initial set of sampling units, as I have noted -- and it is important that researchers choose these units wisely. (Krippendorff, p. 118)
I bolded "starts" because I wondered if the author was hinting at a way to construct a snowball sample automatically and not just manually via human inspection. Could this method still be applied to less defined literary intertextuality? Perhaps if we had a corpus of authors and their works, themes, and major characters, then we could search texts for such references to build a sample-- but what corpus could be exhaustive enough to be accurate? I'm mostly curious to see if others had similar thoughts?
When it comes to researching the past, newspapers are often seen as a source that can enhance our understanding of how a certain issue was commonly understood in the general public. Methodological debates focus on sampling. However, no matter how we sample from newspapers, there remains the issue of the general validity of them as a source. Contemporary debate often considers media bias. How could we use a measure of such bias to make newspapers from the past more useful as reflections of social reality of the time?
Considering the issue of external validity in content analysis mentioned by @bjcliang-uchi, it sometimes does seem true that generalizing the inference to a broader range of samples or social games is not the key emphasis of content analysis, since most of the time what is of interest is just figuring out the specific social games underlying the content. However, there are cases where content analysis also focuses on summing up regularities or social patterns. In such cases, does external validity really depend on sample representativeness, or is it more related to the research question per se? In other words, is using content analysis to reveal social games more like an observational approach or a statistical inference process?
The differences between traditional statistical sampling theory and the sampling methods applied in content analysis really pique my interest. The latter sometimes do not conform to rules that have been well established and examined in statistics, so I am wondering whether the difference stems from the particular nature of content analysis or is a compromise made in view of the large size and ambiguity of textual content.
I’m wondering which sampling method is more appropriate for social media posts and comments. I know some scholars have used opinion leaders as a way to sample posts, but opinion leaders’ posts are far from representative of the whole sphere of discourse happening on social media sites. The article mentions varying probability sampling and I’m wondering about ways of applying this approach to studying social media, but from the article it’s not clear how to assign probabilities to various accounts and actors. My other question has to do with relevance sampling. When do we know we have reached the “true” population of relevant texts and hence do not have to worry that the sample is unrepresentative? And in the case of social media sites, how do we determine which posts are really irrelevant, so that we can use relevance sampling such as relying on opinion leaders?
Friends: Please pose a question about the following reading and how to sample content for your projects:
Krippendorff, Klaus. 2004. "Sampling." In Content Analysis: An Introduction to Its Methodology, 111-124. Thousand Oaks, CA: Sage.