MEDSL / 2022-elections-official

Official returns for the 2022 Midterm Elections

How can I be helpful? #17

Closed NickCrews closed 7 months ago

NickCrews commented 7 months ago

I'm worried from https://github.com/MEDSL/2022-elections-official/issues/15 that I haven't been contributing in a way that is helpful. I want to apologize, and want to set us up so that we can work together better. I also want to be super clear that I really appreciate the work that you are doing here. I also really hope that all these issues I'm filing don't feel like "here, take this task I am dumping on you". I hope they feel like "hey, I noticed this problem, but need some feedback or context here because I don't have enough resources to solve this myself. Let's both put in some effort to get to the bottom of this." If I should be doing something else to make this more enjoyable, please let me know here!

code snippets

I have been posting code snippets because they are unambiguous ways of showing the steps I am doing to reproduce the issue. I admit they aren't immediately reproducible, because (1) you might be working in R instead of Python, (2) you need to have some understanding of ibis, and (3) I am not showing the code that actually loads the data.

Is there some change I could do to make this more useful? Should I just skip them entirely?

actionable vs flags

In that linked issue, you mention that you are looking for issues where I engage with "why" something is the case, which is totally understandable. It is exhausting for someone to dump a problem in your lap with no solution. I am a little limited here, though, in what I can actually do to help find solutions:

  1. the scripts that do the processing aren't released yet, so I can't debug upstream. If these could get put on github then I would take more initiative here.
  2. It is not always clear what the intended behavior is. Notes in the readme are not always the most trustworthy. For example, in https://github.com/MEDSL/2022-elections-official/issues/15 you are right that in the readme Michigan is flagged as having duplicates, but I also mention that other states, such as Connecticut, also have these duplicates, and there is no such flag in the readme for some of these other states. My read of this situation was that MI was as-expected, but the QA checks either (1) were not run on these other states or (2) didn't include a check for this. As a software engineer I have basically become skeptical of the integrity of any human-run process (especially my own). If there isn't an automated test for it, then I don't trust it happened ;)
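To make "an automated test for it" concrete, here is a minimal sketch of the kind of duplicate check I have in mind, in plain Python. The field names (`state`, `precinct`, `office`, `candidate`) and the sample rows are made up for illustration; they are assumptions, not the actual schema or anything from the unreleased MEDSL scripts.

```python
from collections import Counter

# Hypothetical key fields that should uniquely identify a row.
# The real dataset's schema may differ.
KEY_FIELDS = ("state", "precinct", "office", "candidate")


def row_key(row, key_fields=KEY_FIELDS):
    """Build a hashable key from the fields that should be unique together."""
    return tuple(row[field] for field in key_fields)


def states_with_duplicates(rows, key_fields=KEY_FIELDS):
    """Return the sorted list of states that contain at least one repeated key."""
    counts = Counter(row_key(row, key_fields) for row in rows)
    return sorted(
        {row["state"] for row in rows if counts[row_key(row, key_fields)] > 1}
    )


# Made-up sample rows: MI and CT each repeat a key, OH does not.
sample_rows = [
    {"state": "MI", "precinct": "P1", "office": "GOVERNOR", "candidate": "A", "votes": 10},
    {"state": "MI", "precinct": "P1", "office": "GOVERNOR", "candidate": "A", "votes": 10},
    {"state": "CT", "precinct": "P7", "office": "GOVERNOR", "candidate": "B", "votes": 5},
    {"state": "CT", "precinct": "P7", "office": "GOVERNOR", "candidate": "B", "votes": 8},
    {"state": "OH", "precinct": "P3", "office": "GOVERNOR", "candidate": "C", "votes": 12},
]

flagged = states_with_duplicates(sample_rows)
print(flagged)
```

Running a check like this over every state, and comparing the result to the states flagged in the readme, would show whether a flag is missing or whether the duplicates are unexpected.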

I'm not sure what could change here. The ideal thing for me would be if the scripts and tests were released, and then I could self-serve them, as well as push fixes to them. But that might be more hassle than you want to take on? IDK, I think without this I can't be much more useful.

locking issues

It has been a little hard to collaborate when issues get locked, making it impossible for me to comment further. E.g., on #15 I would love to continue the discussion on the states besides MI that have duplicates but aren't flagged in the readme. In general, closing an issue is fine once you think you have the issue solved (I can still come back and comment), but locking it is sort of a nuclear option that makes it impossible to come back and correct any misunderstandings, as happened here. IDK, does that seem reasonable?

Thanks for your time and effort!

sbaltzmit commented 7 months ago

Thanks for the comments. I appreciate your reports, and the data are better because you identified several problems. I'll keep working to address the ones you've raised that I haven't gotten around to yet. It's a generous use of your time and one that we certainly are grateful for.

In the Issue you reference, I was only asking that Issues be supported by a reasonable attempt to verify that the problem originates with or is limited to our dataset, that it is not addressed in the readme, and that it is not an issue that is present in the data source (especially if that source is official election result data). Of course there might be good reason to share comments that are primarily about some feature of a state's election result reporting, or commentary about our process for cleaning data. But those would be better directed to me by email at sbaltz@mit.edu, so that we can keep this space focused on plausibly resolvable problems in this repository of 2022 precinct-level election results. That helps me be more prompt about addressing the issues that we can remedy. But of course, thank you again for energetically identifying several opportunities for improvement, and helping us share high-quality data.