Closed rrodriguezbarron closed 3 years ago
Ruben Rodriguez rrodriguezbarron@uchicago.edu Wed, Apr 21, 2021 at 1:00 PM To: Philip Waggoner pdwaggoner@uchicago.edu Cc: Ruben Heuer heuer@uchicago.edu, Spencer Ferguson-Dryden csfergusondryden@uchicago.edu, Tiancheng Pu gabrielpu@uchicago.edu
Philip,
Thank you so much for the feedback. It is really helpful and very comprehensive, so we have a lot to do for the next step of the process. I just wanted to go over some of the points with you:
Let me know if you think something could be further improved upon. Thanks again and see you next week.
PS I'm CCing my team for their archive.
Sincerely,
Philip Waggoner pdwaggoner@uchicago.edu Wed, Apr 21, 2021 at 2:50 PM To: Ruben Rodriguez Barron rrodriguezbarron@uchicago.edu Cc: Ruben Heuer heuer@uchicago.edu, Spencer Ferguson-Dryden csfergusondryden@uchicago.edu, Tiancheng Pu gabrielpu@uchicago.edu
Hi Ruben et al. -
Thanks for the reply. A few responses of my own where appropriate:
Re: data. Great! Glad time is not an issue. I would change this first sentence in your data section then and make sure you're all on the same page: "Data used for this model will come from the 2016 American National Election Studies (ANES) Time Series Study (2017)."
Absolutely, it is appropriate for your methods to change, adjust, and refine as your idea materializes and you get into the data. This is a normal part of any research project. Just try not to change substantive ideas too much, or this could create more work and other issues downstream.
Glad also you aren't measuring polarization. It's way too fraught with measurement and construct issues. In this case, I would try at all costs to omit the word from your project and avoid it. You might suggest hints of it if you find really clear separation between the parties in the space. But be very clear, if you do, that you aren't formally testing for or exploring polarization, if in fact you stick to this research proposal you've submitted.
Everything else sounds good! Onward.
By the by, next week is asynchronous. All will be up on Canvas explaining steps, assignments, etc.
And don't forget about the challenge being posted tomorrow morning.
All best, pw
--
Philip Waggoner https://pdwaggoner.github.io
Feedback
Great start!
Here are some comments (forgive the brevity and tone as these were stream of consciousness while reading):
Biggest flag is the use of the term "effect" in the intro. With UML, we never make predictions or estimate "effects" as we do in supervised tasks. Thus, to explore the relationship between affect and polarization, you'd have to do so descriptively, picking up on latent patterns and relationships that naturally exist in the feature space, rather than a supervised and (usually) parametric approach to estimating quantities of interest.
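To make the descriptive framing concrete, here is a minimal synthetic sketch (invented data, not the ANES; the two "groups" and the feeling-thermometer-style features are assumptions for illustration only). The point is that the unsupervised workflow projects the feature space and inspects whatever latent structure exists, with no outcome variable and no estimated "effect":

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for feeling-thermometer features (0-100
# ratings); two loose latent groups are baked in for illustration.
group_a = rng.normal(loc=[80, 20, 75], scale=10, size=(100, 3))
group_b = rng.normal(loc=[25, 85, 30], scale=10, size=(100, 3))
X = np.clip(np.vstack([group_a, group_b]), 0, 100)

# Descriptive UML: center the data and project onto principal
# components via SVD -- no dependent variable anywhere in sight.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T  # 2-D projection for visual inspection

# Variance explained says how much latent structure the low-
# dimensional view retains; interpretation stays descriptive.
explained = S**2 / np.sum(S**2)
print(f"variance explained by PC1+PC2: {explained[:2].sum():.2f}")
```

Any separation visible in `scores` is a pattern that exists in the feature space, to be described and interpreted, not a quantity of interest being estimated.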
You are proposing to use a ton of algorithms. This isn't necessarily a "bad" thing in the normative sense. But it does make for a tall order for a project like this. You are welcome to proceed as planned, but a point to consider for the final write-up is how exactly these methods build on each other, why they are useful in concert, and why these instead of the many others you could still be using? That is, much more justification for method selection is warranted here, to make for a more cohesive project.
How are you measuring polarization? There are so many ways this is measured in the literature. It seems like you're planning on using feelings toward a particular political candidate. But note, this is a mere proxy for polarization, and really an imperfect one. What FTs really get at is a more granular look at political preferences, which may or may not be "polarized". If you use party ID or some other proxy for polarization, think really carefully about the substantive information underlying the construct. That is, if using self-identified party ID/partisan leanings, you are picking up on political identity, which might be used in a study of polarization, but is itself not a measure of polarization. There are still further measures and approaches to exploring polarization out there. So, all of this to say, the measurement and exploration of "polarization" is a deceptively complex task. Many preferences and pieces of information might point toward polarization, but may do so imperfectly. So think really carefully about exactly what data you are modeling, and thus how best to interpret the patterns you pick up on.
You are using the time series version of the ANES. This is fine, but how exactly are you going to handle time? The algorithms you have mentioned don't explicitly model time or handle it in traditional/explicit ways (e.g., econometric approaches like error correction models or transfer functions, or on the ML side, LSTM/RNN, etc.). So think really carefully about time, as it may skew the patterns you find and give a hint of structure in the data space where perhaps none, or different structure, truly exists. I just took up this idea in a recent paper I published. Take a look here if you're interested: https://ieeexplore.ieee.org/abstract/document/9355031. As an alternative to dealing with time, you might consider using a different ANES data set, like the 2019 ANES pilot study or the 2016 version. Check out the ANES website for more options: https://electionstudies.org/data-center/
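A small synthetic sketch of the worry about unmodeled time (invented data, not the ANES; the wave labels and shared drift are assumptions for illustration): respondents here are homogeneous within each survey wave, yet because responses drift across waves, the first principal component ends up tracking the wave variable almost exactly. An unsupervised algorithm run on the pooled data would "discover" structure that is really just time:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: three survey waves of otherwise-homogeneous
# respondents, with a common drift in responses across waves.
waves = np.repeat([0, 1, 2], 100)      # wave label per respondent
drift = waves[:, None] * 15.0          # shared shift over time
X = 50 + drift + rng.normal(scale=5, size=(300, 4))

# Unsupervised projection (PCA via SVD), with time left unmodeled.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]

# PC1 is almost perfectly aligned with the wave variable: the
# apparent "structure" in the data space is just the time trend.
r = np.corrcoef(pc1, waves)[0, 1]
print(f"|corr(PC1, wave)| = {abs(r):.2f}")
```

One simple diagnostic, then, is to color or cross-tabulate any discovered clusters or components by survey year before interpreting them substantively.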
Ultimately, after reading, I am not seeing a ton of methods or process that help you with your stated goal at the outset (media consumption and affect). There are certainly routes you could take, but as is written, the link between these concepts on the basis of the methods you propose is unclear. Overall, great start! Keep at it, and let me know if you'd like to discuss any point in greater detail. Of course, happy to do so if needed.