mcaceresb / stata-gtools

Faster implementation of Stata's collapse, reshape, xtile, egen, isid, and more using C plugins
https://gtools.readthedocs.io
MIT License

Extended missing values are not preserved #18

Closed mcaceresb closed 7 years ago

mcaceresb commented 7 years ago

Stata's collapse preserves extended missing values (.a, .b, ..., .z) for the min, max, first, last, firstnm, and lastnm statistics. However, gcollapse converts all of them to the system missing value (.).

. sysuse auto, clear
(1978 Automobile Data)

. replace price = .a
(74 real changes made, 74 to missing)

. gcollapse (first) price, by(foreign)

. l

     +------------------+
     |  foreign   price |
     |------------------|
  1. | Domestic       . |
  2. |  Foreign       . |
     +------------------+

By contrast, collapse preserves the extended missing value:


     +------------------+
     |  foreign   price |
     |------------------|
  1. | Domestic      .a |
  2. |  Foreign      .a |
     +------------------+

Further, glevelsof does not handle extended missing values correctly. Consider:

clear
set obs 5
gen x = _n
replace x = .  in 2
replace x = .a in 3
replace x = .b in 4
glevelsof x

While "." is excluded, ".a" and ".b" both appear in the output, printed as their internal numeric representation rather than as ".a" and ".b".
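To make the "internal representation" concrete, here is a minimal C sketch of how a double could be mapped back to a missing-value label. It is not the gtools implementation. It assumes the encoding that Stata's documentation gives for doubles in hex format ("." as +1.0000000000000X+3ff, ".a" as +1.0010000000000X+3ff, up to ".z" as +1.01a0000000000X+3ff). The names MISS_BASE, MISS_STEP, missing_label, and print_level are hypothetical.

```c
#include <stdio.h>

/* ASSUMPTION: encoding taken from Stata's documented hex format for doubles,
   "." = +1.0000000000000X+3ff and ".a" = +1.0010000000000X+3ff. */
#define MISS_BASE 0x1.0p1023  /* "." */
#define MISS_STEP 0x1.0p1011  /* gap between consecutive codes: 2^-12 * 2^1023 */

/* Hypothetical helper: if v is one of the 27 missing codes, write its label
   (".", ".a", ..., ".z") into buf and return 1; otherwise return 0. */
int missing_label(double v, char buf[3])
{
    /* Reject ordinary numbers, NaN, and anything past ".z" (including inf). */
    if (!(v >= MISS_BASE) || v > MISS_BASE + 26 * MISS_STEP)
        return 0;

    double steps = (v - MISS_BASE) / MISS_STEP;
    int code = (int) steps;
    if (steps != (double) code)   /* not exactly on a missing code */
        return 0;

    buf[0] = '.';
    buf[1] = code ? (char) ('a' + code - 1) : '\0';
    buf[2] = '\0';
    return 1;
}

/* Hypothetical helper: print a level as its label if it is a missing code,
   and as a plain number otherwise. */
void print_level(double v)
{
    char buf[3];
    if (missing_label(v, buf))
        printf("%s\n", buf);
    else
        printf("%.15g\n", v);
}
```

Under these assumptions, formatting the stored double for ".a" with a plain numeric format prints a number near 8.99e+307, while print_level prints ".a". The same check could also be used to decide whether a value should be dropped together with ".".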