jmarkgraf / PresentationAssignment


Comments: R file #2

Open jmarkgraf opened 8 years ago

jmarkgraf commented 8 years ago

Hi Malte, here are my comments on the R file:

These are my first comments on the R file. I hope they are helpful for your next steps!

mberneaud commented 8 years ago

First of all, thanks for the extensive comments on my file, especially for highlighting the errors in the order of subsetting and merging.

Some clarifications regarding my steps:

Variable "TotalTerms" (lines 82-86): the variable does not tell us whether it is the mayor's first term or already her fifth; this, however, is crucial, as we presume that re-election is more likely the more terms the incumbent has served as mayor.

I agree that this is a real downside of the variable. I did it this way because I couldn't find any way of calculating the number of preceding terms for each row in the data set. I remember you telling me that Christopher added such functionality to his DataCombine package after the two of you discussed it. Could you please share the code you used to create the term variable with the GitHub version of DataCombine, either here or in a GitHub Gist, so that I can include it in our R source file?

Variable "Year" (lines 88-89): what is the added value of this variable?

As I didn't convert the date strings into R's date format using lubridate, I extracted the year numbers from the date string and coerced them into an integer to be used in the subsetting. While this might not be the most parsimonious way of doing it, I'll leave it "as is" as it should suffice for our analysis, which is conducted on the year-level.
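For illustration, a minimal base-R sketch of the year extraction described above. The `DD.MM.YYYY` date format and the toy values are assumptions; only the column name `ElectionDate` is taken from the thread.

```r
# Toy data; the DD.MM.YYYY format is an assumption, not confirmed by the thread.
MayorElection <- data.frame(
  ElectionDate = c("14.03.2008", "02.03.2014", "15.03.2020"),
  stringsAsFactors = FALSE
)

# Find the first run of four digits in each date string and coerce it to integer.
year_match <- regexpr("[0-9]{4}", MayorElection$ElectionDate)
MayorElection$Year <- as.integer(
  regmatches(MayorElection$ElectionDate, year_match)
)

# The integer year can then be used for subsetting, e.g. elections from 2010 on.
recent <- MayorElection[MayorElection$Year >= 2010, ]
```

Note that `regmatches` silently drops strings without a match, so this only works as written if every date string contains a four-digit year.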

Variable "Reelection" (lines 97-103): our dataset is not ordered by municipality ID but by incumbent name, which leads to wrong values. You need to order it by IDMunicipality (first) and election year (second) before you do these steps.

You're absolutely right about this. Thanks for highlighting the issue. I have rearranged my code so that the Reelection variable and all the lagged dependent variables are created after the subsetting of the data set, and I now order the data prior to merging.
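A minimal sketch of the ordering step discussed here, using base R's `order`. The column names follow the thread; the toy values are hypothetical.

```r
# Toy data with hypothetical values; column names follow the thread.
MayorElection <- data.frame(
  IDMunicipality = c(2, 1, 2, 1),
  Year           = c(2014L, 2014L, 2008L, 2008L),
  NameCandidate1 = c("Zeller", "Auer", "Zeller", "Auer"),
  stringsAsFactors = FALSE
)

# Sort by municipality first, then by election year within each municipality.
MayorElection <- MayorElection[order(MayorElection$IDMunicipality,
                                     MayorElection$Year), ]
rownames(MayorElection) <- NULL
```

Any lagged or re-election variable computed afterwards then compares each row with the previous election in the same municipality, provided the lag is also restricted to that municipality.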

You first need to exclude one round of each runoff election (either the first round that led to the runoff or the second round, which is the outcome of the runoff election).

I have excluded those elections which are coded "3" for ElectionType. This excludes all first rounds where run-offs were necessary and instead uses the result of the run-off election for the year under scrutiny. While this probably overstates the margin by which the candidate won, I consider it preferable to the risk of falsely declaring as winners those candidates who were leading in the first round but lost in the run-off.
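The exclusion described above could look roughly like this. Only the code `3` for ElectionType comes from the thread; the other codes and all values in the toy data are hypothetical.

```r
# Toy data; only ElectionType == 3 (first round that led to a run-off)
# is taken from the thread, the codes 1 and 2 are hypothetical.
MayorElection <- data.frame(
  IDMunicipality  = c(1, 1, 2),
  ElectionType    = c(3, 2, 1),
  VoteShareWinner = c(41.2, 56.8, 63.0)
)

# Drop first rounds that were followed by a run-off; keep everything else.
NoFirstRounds <- MayorElection[MayorElection$ElectionType != 3, ]
```

If ElectionType can contain missing values, wrapping the condition in `which()` avoids the all-`NA` rows that logical indexing would otherwise produce.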

You might want to slide the gender variable, too. You need to slide the vote share variable of the winner.

I've done that in the source code, but you might have just missed it given the length of the document. The code I used is below.

MayorElection <- slide(MayorElection, Var = "VoteShareWinner", TimeVar = "ElectionDate", NewVar = "L.VoteShareWinner")
MayorElection <- slide(MayorElection, Var = "Geschlecht1", TimeVar = "ElectionDate", NewVar = "L.Geschlecht1")
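The two `slide` calls as posted do not pass a grouping variable. To my knowledge `slide` also accepts a `GroupVar` argument for this, but that should be checked against the DataCombine documentation. As a package-free illustration of what a lag within each municipality does, here is a base-R sketch; the helper `lag_one` and the toy values are hypothetical.

```r
# Toy data with hypothetical values; column names follow the thread.
MayorElection <- data.frame(
  IDMunicipality  = c(1, 1, 1, 2, 2),
  Year            = c(2002L, 2008L, 2014L, 2008L, 2014L),
  VoteShareWinner = c(52.1, 60.4, 58.3, 71.0, 49.9)
)

# The lag only makes sense on data ordered by municipality and year.
MayorElection <- MayorElection[order(MayorElection$IDMunicipality,
                                     MayorElection$Year), ]

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag_one <- function(x) c(NA, head(x, -1))

# Apply the lag separately within each municipality.
MayorElection$L.VoteShareWinner <- ave(MayorElection$VoteShareWinner,
                                       MayorElection$IDMunicipality,
                                       FUN = lag_one)
```

The first election of each municipality gets `NA`, so a value never leaks from one municipality into the next.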

Once again, thanks for the feedback. Your expertise with the data set is really showing here.

jmarkgraf commented 8 years ago

Very quickly: here is the code that I used for the term number of mayors. It takes quite some time to run, though:

## Number of terms of mayors
MayorElection$FakeVar <- 1
MayorElection <- CountSpell(MayorElection, TimeVar = "ElectionDate", 
                            SpellVar = "FakeVar", GroupVar = "NameCandidate1", 
                            NewVar = "TermMayor", SpellValue = 1)
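If the run time becomes a problem, a running term counter per mayor can also be sketched in base R. This is an alternative under stated assumptions, not a reproduction of `CountSpell`: it assumes that each row is one election won by `NameCandidate1`, that names identify persons uniquely, and that gaps between terms should not reset the count. The toy values are hypothetical.

```r
# Toy data with hypothetical values; column names follow the thread.
MayorElection <- data.frame(
  NameCandidate1 = c("Auer", "Zeller", "Auer", "Auer", "Zeller"),
  Year           = c(2002L, 2008L, 2008L, 2014L, 2014L),
  stringsAsFactors = FALSE
)

# Order by mayor and year so the counter runs chronologically.
MayorElection <- MayorElection[order(MayorElection$NameCandidate1,
                                     MayorElection$Year), ]

# Number the rows 1, 2, 3, ... within each mayor's name.
MayorElection$TermMayor <- ave(seq_len(nrow(MayorElection)),
                               MayorElection$NameCandidate1,
                               FUN = seq_along)
```

Because `ave` works on whole vectors, this avoids the need for the constant helper column `FakeVar`.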