DamonCharlesRoberts / judiciary_nominations

An academic project examining the differences in topics and rates of interruption of POC and Female judiciary nominees
Creative Commons Attribution 4.0 International

Update file names for existing transcript data #26

Open DamonCharlesRoberts opened 1 year ago

DamonCharlesRoberts commented 1 year ago
madelinemader commented 11 months ago

For the transcripts where we need to split them, what information exactly is needed? For example, the transcript “bryant_09-26-2006” includes the hearing for Vanessa Bryant and Michael Wallace. But the way the hearing is organized, Bryant and Wallace are presented, then give statements, then there are witness statements, then the written questions and answers. So for this one, do I just need to pull the section for Bryant from “statements of the nominees” or is other information needed?

(Two screenshots attached: 2023-09-25, 10:58 AM)

DamonCharlesRoberts commented 11 months ago

Hmmm, good question. Some of these are a lot messier than I had realized, so I think this is forcing us to make a decision here.

In terms of interruptions, we ONLY want the text from the transcripts of the hearing -- what was said in real-time during the hearing.

I do think there is something interesting to be gleaned from the written remarks and the questions they were asked with regard to the second question in our project: "Is the topic of the questions and answers between nominees based on their gender and racial/ethnic identity different?" I think we could get some useful insights there by considering the full text, which I am realizing we had used for a lot of these in the past.

So, my initial thought is to just keep everything. Our model for interruptions doesn't standardize; it just takes raw counts of interruptions, which won't occur in the written statements, so keeping them won't influence our results there. And if we keep the material that comes with the transcripts, then we can say that the confirmation process, not just the hearings, differs between male and female nominees and between non-POC and POC nominees. Our discussion of the topic models would then need to be broadened to the whole confirmation process rather than just the hearing.
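
To illustrate why keeping the written material shouldn't move a raw count, here is a minimal sketch. It is not the project's actual model: it assumes, purely for illustration, that an interrupted turn is marked by a trailing `--` at the end of a line, and the names `count_interruptions`, `spoken`, and `written` are hypothetical.

```python
import re

# ASSUMPTION (illustrative only): an interrupted turn ends its line with "--".
# The project's real interruption measure may be defined differently.
INTERRUPTED_RE = re.compile(r"--\s*$", re.MULTILINE)


def count_interruptions(text):
    """Return the raw number of lines that end with the assumed '--' marker."""
    return len(INTERRUPTED_RE.findall(text))


# Made-up snippets standing in for the spoken hearing and the written Q&A.
spoken = (
    "Judge Bryant. I believe the court should--\n"
    "Senator Hatch. Let me stop you there.\n"
    "Judge Bryant. Of course.\n"
)
written = (
    "Question 1. Please describe your judicial philosophy.\n"
    "Answer. I would apply the law as written.\n"
)

# Appending written material adds text but no interruption markers.
print(count_interruptions(spoken), count_interruptions(spoken + written))  # 1 1
```

Because the count is raw rather than normalized by document length, the extra written text changes nothing here. A rate per word or per turn would behave differently.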

What say you @madelinemader and @tylerpgarrett?

madelinemader commented 11 months ago

But is the relevant portion only the nominees' spoken statements? In the above example, there are two nominees, and both of their spoken statements come after the senators' spoken statements but before the written statements and the spoken witness statements. So for the purposes of splitting the transcript for the two nominees in this session, should I just pull the portion with the nominees' spoken statements?

DamonCharlesRoberts commented 11 months ago

Yeah, so the most relevant thing is the spoken statements. If we can't split things up very cleanly between nominees, we should focus on splitting the spoken statements as best we can.
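
A rough sketch of what that split could look like, under stated assumptions: each spoken turn starts a line with a title and surname followed by a period (e.g. `Judge Bryant.`), and surnames are unique within a hearing. The function names `parse_turns` and `split_by_nominee` and the sample text are hypothetical, not something already in the repo.

```python
import re
from collections import defaultdict

# ASSUMPTION: a spoken turn opens a line with "Title Surname. ", e.g. "Senator Hatch. ".
# The title list is illustrative and would need extending for real transcripts.
TURN_RE = re.compile(
    r"^(Senator|Chairman|Judge|Mr\.|Ms\.|Mrs\.) ([A-Z][A-Za-z'-]+)\.\s+",
    re.MULTILINE,
)


def parse_turns(text):
    """Return (surname, speech) pairs in the order they appear."""
    matches = list(TURN_RE.finditer(text))
    turns = []
    for i, m in enumerate(matches):
        # A turn runs until the next speaker label (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        turns.append((m.group(2), text[m.end():end].strip()))
    return turns


def split_by_nominee(text, nominees):
    """Collect each nominee's turns plus the non-nominee turn right before each."""
    turns = parse_turns(text)
    out = defaultdict(list)
    for i, (speaker, speech) in enumerate(turns):
        if speaker in nominees:
            if i > 0 and turns[i - 1][0] not in nominees:
                out[speaker].append(turns[i - 1])
            out[speaker].append((speaker, speech))
    return dict(out)


# Made-up sample, not taken from the actual hearing transcript.
sample = (
    "Senator Hatch. Judge Bryant, please proceed.\n"
    "Judge Bryant. Thank you, Senator.\n"
    "Senator Hatch. Mr. Wallace, your statement?\n"
    "Mr. Wallace. Thank you. I am honored.\n"
)
by_nominee = split_by_nominee(sample, {"Bryant", "Wallace"})
```

Keeping the turn just before each nominee turn preserves the question or prompt they were responding to, which matters for spotting interruptions. Anything the pattern can't attribute (written Q&A, witness statements) is simply left out of the per-nominee split.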