Algorithm not working on example separation datasets

Hi all,

I am implementing the separation algorithm myself and testing it against the example datasets. I followed the example code in https://github.com/sergiocorreia/ppmlhdfe/blob/master/guides/separation_primer.md and, for some datasets, the observations the algorithm flags as separated differ from those marked in the dataset's separated variable. Here are two examples (datasets 03 and 04):

import delimited https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/test/separation_datasets/03.csv, clear

* Run IR (iterative rectifier) algorithm
loc tol = 1e-5
gen u =  !y
su u, mean
loc K = ceil(r(sum) / `tol' ^ 2)
gen w = cond(y, `K', 1) 

while 1 {
    qui reghdfe u [fw=w], absorb(id1 id2 id3) resid(e)
    predict double xb, xbd
    qui replace xb = 0 if abs(xb) < `tol'

    * Stop once all predicted values become non-negative
    qui cou if xb < 0
    if !r(N) {
        continue, break
    }

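    replace u = max(xb, 0)
    * Keep w (the weights are reused by every reghdfe call);
    * drop xb and e so that predict and resid() can recreate them
    drop xb e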
}

rename xb z
gen is_sep = z > 0
list
assert separated == is_sep

(1 contradictions)

import delimited https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/test/separation_datasets/04.csv, clear

* Run IR (iterative rectifier) algorithm
loc tol = 1e-5
gen u =  !y
su u, mean
loc K = ceil(r(sum) / `tol' ^ 2)
gen w = cond(y, `K', 1) 

while 1 {
    qui reghdfe u [fw=w], absorb(id1 id2) resid(e)
    predict double xb, xbd
    qui replace xb = 0 if abs(xb) < `tol'

    * Stop once all predicted values become non-negative
    qui cou if xb < 0
    if !r(N) {
        continue, break
    }

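    replace u = max(xb, 0)
    * Keep w (the weights are reused by every reghdfe call);
    * drop xb and e so that predict and resid() can recreate them
    drop xb e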
}

rename xb z
gen is_sep = z > 0
list
assert separated == is_sep

(2 contradictions)
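
In case it helps to narrow this down, here is a small sketch for listing the observations that disagree. It is meant to be run after either block above and assumes the z, is_sep and separated variables created there; mismatch is just a hypothetical helper name.

```stata
* Flag observations where the dataset's label and the IR result disagree
gen byte mismatch = (separated != is_sep)

* Show only the disagreeing rows, with all variables
list if mismatch

* Cross-tabulate the two indicators to see the direction of the disagreement
tab separated is_sep
```

The cross-tabulation should show whether the algorithm flags too many observations or too few compared with the dataset's separated variable.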

Could you please tell me whether (1) the algorithm involves additional steps that the example code does not capture and that would flag these observations differently; (2) the example datasets themselves contain errors; or (3) these observations are flagged as separated by one of the other methods and, if so, how I should interpret that?

Thanks again for this package. It's great!

Luís
