svmiller / peacesciencer

Tools and Data for Quantitative Peace Science
http://svmiller.com/peacesciencer
GNU General Public License v2.0

directed dyad alliances #13

Open kevingalambos opened 1 year ago

kevingalambos commented 1 year ago

Hi Steve,

This package rules; it saves me countless hours. I came across an issue today in creating directed dyad years with alliances. I don't know the issue source, but basically alliances are only present in one direction. For example:

dy <- create_dyadyears(subset_years = 2005:2010, directed = T) %>%
  add_cow_alliance() %>%
  add_atop_alliance()

dy %>% filter(ccode1 == 2 & ccode2 == 740) # USA and Japan are allies
dy %>% filter(ccode1 == 740 & ccode2 == 2) # but Japan and USA not allies

dy1 <- create_dyadyears(subset_years = 2005:2010, directed = F) %>%
  add_cow_alliance() %>%
  add_atop_alliance()

dy1 %>% filter(ccode1 == 2 & ccode2 == 740) # allies in undirected dyad

Hope this is a quick fix, or even better, user error.

Kevin

svmiller commented 1 year ago

Ooof, yeah, this is a quick fix. I'm laying low with a COVID booster right now so I don't know if I'll get to it today, but thanks for bringing this to my attention.

svmiller commented 1 year ago

Btw, I think I'm seeing what might have happened on the CoW front. {peacesciencer} is working fine, but I think the raw directed dyad-year data set for alliances is the culprit here. Observe:

read_csv("~/Dropbox/data/cow/alliance/4.1/alliance_v4.1_by_directed_yearly.csv") %>% 
    select(ccode1, ccode2, year, left_censor, right_censor, defense:entente)  %>%  filter(ccode1 == 740 & ccode2 == 2) %>% tail

Rows: 148258 Columns: 19                                                                                                                         
── Column specification ──────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): state_name1, state_name2
dbl (17): version4id, ccode1, ccode2, dyad_st_day, dyad_st_month, dyad_st_year, dyad_end_day, dyad_end_mon...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 6 × 9
  ccode1 ccode2  year left_censor right_censor defense neutrality nonaggression entente
   <dbl>  <dbl> <dbl>       <dbl>        <dbl>   <dbl>      <dbl>         <dbl>   <dbl>
1    740      2  2007           0            1       0          0             0       0
2    740      2  2008           0            1       0          0             0       0
3    740      2  2009           0            1       0          0             0       0
4    740      2  2010           0            1       0          0             0       0
5    740      2  2011           0            1       0          0             0       0
6    740      2  2012           0            1       0          0             0       0

Basically, I don't think directed dyad-year CoW alliance data are truly directed. I'll probably end up "un-directing" them and re-directing them.
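
One way to check a claim like this is to pair every directed row with its mirror image and look for disagreements. The following is only a minimal sketch on made-up toy data: the values, the `alliances` object, and the `mirror` helper are assumptions for illustration, not the contents of the CoW file.

```r
library(dplyr)

# Toy directed dyad-year data (hypothetical values, not the real CoW file).
alliances <- tibble::tribble(
  ~ccode1, ~ccode2, ~year, ~defense,
        2,     740,  2010,        1,
      740,       2,  2010,        0,
        2,     200,  2010,        1,
      200,       2,  2010,        1
)

# Swap the two country codes so each row can be matched with its mirror image.
mirror <- alliances %>%
  select(ccode1 = ccode2, ccode2 = ccode1, year, defense_rev = defense)

# Keep only the dyad-years whose two directions disagree.
mismatches <- alliances %>%
  inner_join(mirror, by = c("ccode1", "ccode2", "year")) %>%
  filter(defense != defense_rev)

print(mismatches)
```

An empty result would suggest the two directions agree everywhere. Dyad-years with no mirror row at all drop out of the `inner_join()`, so a separate `anti_join()` would be needed to catch those.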

kevingalambos commented 1 year ago

Just checked and the same problem exists in the ATOP source file. I wonder how these data are generated, and how many datasets have the same issue. Seems like a good task for a grad student to tackle. Oh, wait...

Happy booster recovery.

svmiller commented 1 year ago

So yeah, that's good that it's not a {peacesciencer} issue, per se. I think the easiest path forward would be taking a non-directed version of the alliance data, directing it, and rolling with that.
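
As a rough sketch of what "directing" a non-directed data set could look like (toy data, hypothetical column values, and not necessarily how {peacesciencer} does it internally), the rows can be stacked on a copy with the two country codes swapped:

```r
library(dplyr)

# Toy non-directed alliance data (hypothetical): one row per dyad-year.
nondirected <- tibble::tribble(
  ~ccode1, ~ccode2, ~year, ~defense,
        2,     740,  2009,        1,
        2,     740,  2010,        1
)

# Stack the original rows on a copy with ccode1 and ccode2 swapped.
# This assumes every commitment applies equally in both directions.
directed <- bind_rows(
  nondirected,
  nondirected %>% select(ccode1 = ccode2, ccode2 = ccode1, year, defense)
) %>%
  arrange(year, ccode1, ccode2)

print(directed)
```

Note that this approach treats every alliance as mutual by construction, so it cannot represent obligations that run in only one direction.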

svmiller commented 1 year ago

Fixed the CoW alliance one, I believe. Unrelated to this, I'll have to fix add_cow_alliance() too because I need to hard-cut the temporal bound there to be 2012. Lazy programming on my part.
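
For illustration only, one reading of a "hard cut" at 2012 is that dyad-years after the data's last year should be missing rather than 0. The sketch below assumes that reading; the `cow_defense` column name, the toy values, and the `last_year` variable are assumptions for this example rather than the package's actual code.

```r
library(dplyr)

# Toy merged dyad-year data (hypothetical); 2013 lies beyond the alliance data.
dy <- tibble::tribble(
  ~ccode1, ~ccode2, ~year, ~cow_defense,
        2,     740,  2011,            1,
        2,     740,  2012,            1,
        2,     740,  2013,            0
)

last_year <- 2012  # assumed final year covered by the alliance data

# Anything after the last covered year becomes NA instead of a misleading 0.
dy_cut <- dy %>%
  mutate(cow_defense = if_else(year > last_year, NA_real_, cow_defense))

print(dy_cut)
```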

svmiller commented 1 year ago

FWIW, I don't see the issue in the ATOP version. I think this was just a CoW bug.

jandresgannon commented 1 month ago

> FWIW, I don't see the issue in the ATOP version. I think this was just a CoW bug.

I'm seeing a similar issue with ATOP in {peacesciencer}.

MWE for US-Japan dyad:

df <- create_dyadyears(system = "cow",
                       subset_years = c(1970:2014),
                       directed = TRUE) |>
  dplyr::filter(ccode1 %in% c(2, 740) & ccode2 %in% c(2, 740)) |>
  add_atop_alliance()

That shows the US having a defense pact with Japan but not Japan having a defense pact with the US.

I think the issue may have to do with which version of ATOP dyad is being used. From the ATOP 5 codebook:

The dyad-year dataset (atop5_0dy) includes information about the commitments shared by a pair of states in a given year; we describe these data in section 2.6. The directed dyad-year dataset (atop5_0ddyr) provides information about the commitments made by one state to a specific dyadic partner in a given year. When alliance obligations are asymmetric, the dyad-year and directed dyad-year data differ.

If I'm remembering correctly from ATOP (which I may not be), in one sense the existing peacesciencer is correct since the US has promised to come to Japan's defense if Japan is attacked, but Japan has not promised to come to the US' defense if the US is attacked. It's a defense pact, but not a mutual defense pact.

But if you're using this function to do something like code a country's threat environment (Poast 2019, etc), the resulting df would not identify Japan as part of the US threat environment because atop_defense == 1 for the US-JPN dyad, but it would identify the US as part of the Japan threat environment because atop_defense == 0 for the JPN-US dyad.

I think the solution is to clarify whether the atop.dta used for the dyad_year df is the directed ATOP or the undirected ATOP. I can't think of a theoretical reason to prefer one over the other, since both are useful in different instances. Maybe there's just some way to add a column in the merge, atop_defense_mutual, that equals 1 if atop_defense == 1 for both ccode1-ccode2 and ccode2-ccode1.
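
A minimal sketch of that idea, using a self-join on the reversed dyad. The toy data and the `reverse` helper are made up for illustration; only `atop_defense` and the proposed `atop_defense_mutual` come from the discussion above.

```r
library(dplyr)

# Toy directed dyad-year data (hypothetical values).
dy <- tibble::tribble(
  ~ccode1, ~ccode2, ~year, ~atop_defense,
        2,     740,  2000,             1,
      740,       2,  2000,             0,
        2,     200,  2000,             1,
      200,       2,  2000,             1
)

# Mirror image of the data: what ccode2 has promised to ccode1.
reverse <- dy %>%
  select(ccode1 = ccode2, ccode2 = ccode1, year,
         atop_defense_rev = atop_defense)

# A pact is flagged as mutual only when both directions equal 1.
dy_mutual <- dy %>%
  left_join(reverse, by = c("ccode1", "ccode2", "year")) %>%
  mutate(atop_defense_mutual =
           as.integer(atop_defense == 1 & atop_defense_rev == 1)) %>%
  select(-atop_defense_rev)

print(dy_mutual)
```

The original directed atop_defense column is left untouched, so both the one-way and the mutual coding stay available in the same data frame.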

jandresgannon commented 1 month ago

My temporary hacky code to address this, in case it's helpful:

create_dyadyears(system = "cow",
                 subset_years = c(1970:2014),
                 directed = TRUE) |>
  add_atop_alliance() |>
  dplyr::select(year, ccode1, ccode2, atop_defense) |>
  # group both directions of a dyad together within each year
  dplyr::group_by(year, dyad = pmin(ccode1, ccode2), dyad_rev = pmax(ccode1, ccode2)) |>
  dplyr::mutate(atop_defense_sum = sum(atop_defense)) |>
  dplyr::ungroup() |>
  dplyr::mutate(atop_defense_type = dplyr::case_when(atop_defense_sum == 2 ~ "symmetric",
                                                     atop_defense_sum == 1 ~ "asymmetric",
                                                     atop_defense_sum == 0 ~ "none")) |>
  dplyr::select(-dyad, -dyad_rev, -atop_defense_sum)

svmiller commented 1 month ago

Ah, I can see that now. Reopening this.