Team discussion - what do we want to do with [deleted]?

We need to decide how to handle [deleted] comments and users. The easiest option is to ignore them, but removing them from the data will introduce errors into the dynamic network construction process: we would be analyzing network structures that don't accurately represent the conversation that actually took place. Furthermore, as @tcrick pointed out, [deleted] comments may indicate speech acts that violate conversational norms central to deliberative theory. ("People don't delete comments for, like, no reason..." - @tcrick) For example, we might think of `[deleted]` as a proxy for somebody "leaving a conversation after the fact," which is closely related to the idea I've floated of labelling things like "conversational exits" with few-shot learning (e.g., using the SetFit approach with sentence transformers).
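To make the network-construction concern concrete, here is a minimal sketch on a made-up four-comment thread. The tuple layout `(comment_id, parent_id, author, body)` and the helper `build_reply_edges` are assumptions for illustration, not our actual schema or pipeline code.

```python
# Hypothetical toy thread: (comment_id, parent_id, author, body).
# The field layout is an assumption, not the project's real schema.
comments = [
    ("c1", None, "alice", "Original post"),
    ("c2", "c1", "[deleted]", "[deleted]"),
    ("c3", "c2", "bob", "Reply to the deleted comment"),
    ("c4", "c3", "alice", "Reply to bob"),
]


def build_reply_edges(rows, drop_deleted):
    """Return (edges, orphans) for a reply tree.

    edges   -- (child_id, parent_id) pairs whose parent is still in the data
    orphans -- child ids whose parent was removed from the data
    """
    kept = [r for r in rows if not (drop_deleted and r[3] == "[deleted]")]
    kept_ids = {r[0] for r in kept}
    edges, orphans = [], []
    for comment_id, parent_id, _author, _body in kept:
        if parent_id is None:
            continue  # top-level post, no reply edge
        if parent_id in kept_ids:
            edges.append((comment_id, parent_id))
        else:
            orphans.append(comment_id)
    return edges, orphans


edges_keep, orphans_keep = build_reply_edges(comments, drop_deleted=False)
edges_drop, orphans_drop = build_reply_edges(comments, drop_deleted=True)

print("keep [deleted]:", edges_keep, orphans_keep)
print("drop [deleted]:", edges_drop, orphans_drop)
```

When [deleted] is kept as a placeholder node, the thread stays a single chain back to the original post. When it is dropped, `c3` loses its parent, so `c3` and `c4` become a fragment disconnected from `c1`, even though the conversation itself was continuous.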

While we can't ethically recover any deleted information, we have plenty of options for how we want to handle [deleted] for any given task in the pipeline. We may want to handle [deleted] on a task-by-task basis rather than globally.
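One way the task-by-task idea could look in practice is a small per-task policy table. This is only a sketch to anchor the discussion: the task names, the `DeletedPolicy` options, and the `apply_policy` helper are all hypothetical, and which policy belongs to which task is exactly what we still need to decide.

```python
from enum import Enum


class DeletedPolicy(Enum):
    KEEP_PLACEHOLDER = "keep_placeholder"  # leave the row untouched
    DROP = "drop"                          # remove the row for this task
    FLAG = "flag"                          # keep the row, add an is_deleted marker


# Hypothetical task names and assignments, for illustration only.
TASK_POLICIES = {
    "network_construction": DeletedPolicy.KEEP_PLACEHOLDER,
    "text_embedding": DeletedPolicy.DROP,
    "exit_labelling": DeletedPolicy.FLAG,
}


def apply_policy(rows, task):
    """Return a new list of comment dicts handled per the task's policy."""
    policy = TASK_POLICIES[task]
    if policy is DeletedPolicy.DROP:
        return [dict(r) for r in rows if r["body"] != "[deleted]"]
    if policy is DeletedPolicy.FLAG:
        return [dict(r, is_deleted=(r["body"] == "[deleted]")) for r in rows]
    return [dict(r) for r in rows]  # KEEP_PLACEHOLDER


rows = [
    {"id": "c1", "body": "Original post"},
    {"id": "c2", "body": "[deleted]"},
]

for task in TASK_POLICIES:
    print(task, apply_policy(rows, task))
```

Keeping the decision in one table means each pipeline step states its handling of [deleted] explicitly instead of inheriting a global default.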

mclevey / podlm

Team discussion - what do we want to do with [deleted]? #18