sergiocorreia / reghdfe

Linear, IV and GMM Regressions With Any Number of Fixed Effects
http://scorreia.com/software/reghdfe/
MIT License

[BUG] parallel option #236

Closed reifjulian closed 1 year ago

reifjulian commented 2 years ago

I am running Stata version 17.0, and the latest versions of reghdfe/ftools/parallel:

reghdfe
*! version 6.12.1 27June2021

ftools
*! version 2.48.0 29mar2021

parallel
*! version 1.20.0 19mar2019
*! PARALLEL: Stata module for parallel computing
*! by George G. Vega [cre,aut], Brian Quistorff [aut]

Bug report

I am unable to get the parallel option to work. Here is a minimum working example:

sysuse auto, clear
reghdfe price mpg, absorb(foreign) parallel(2)

Here are the errors, which are different depending on the OS.

* Windows error:
<istmt>:  3499  unlink_folder() not found
unrecognized command
unrecognized command

* ---------------
* Unix error:
 - Task 1 failed (error code 111)
 - Task 2 failed (error code 111)
invalid syntax
invalid syntax
r(111);

* The lines creating the r(111) on Unix are:
* cap noi {
*.         reghdfe, worker parallel_path("/tmp/PARALLEL_286580155")
*tmp not found
*r(111);
*.         }

* macos error
Unsupported method: macosx-shell
system limit exceeded - see manual
system limit exceeded - see manual
r(1000);

Two notes of possible interest:

  1. reghdfe runs fine if the parallel(2) option is removed
  2. All packages are installed in a local subfolder, as in this example. Perhaps this causes an issue with the parallel() option?

AlejandroRodino commented 1 year ago

Did you find a solution for this?

reifjulian commented 1 year ago

@AlejandroRodino I did not find a solution, although I have also not tried it again since I posted the issue.

gen-li commented 1 year ago

I also have the same problem

sergiocorreia commented 1 year ago

I'm a bit confused about the error; here is some speculation:

  1. I'm able to run it fine on Stata/MP 17.0 on Windows. Note, however, that I'm using reghdfe 6.12.2, which made some changes related to the parallel code (see https://github.com/sergiocorreia/reghdfe/issues/239).
  2. I have no access to OSX so can't tell if the problem is related to parallel or to reghdfe.

To try to pin down the problem, what OS are you running, Gen?

reifjulian commented 1 year ago

I believe the error is related to using a local package installation. I can replicate the error on Windows by doing the following. First, uninstall ftools, reghdfe, and parallel. Then install them in a local folder:

* Install packages
cap mkdir "libraries"
cap mkdir "libraries/stata"
net set ado "libraries/stata"

local github "https://raw.githubusercontent.com"
net install ftools, from("`github'/sergiocorreia/ftools/master/src/") replace
net install reghdfe, from("`github'/sergiocorreia/reghdfe/master/src/") replace

net install parallel, from("https://raw.github.com/gvegayon/parallel/stable/") replace
mata mata mlib index

Close and reopen Stata. Then run:

tokenize `"$S_ADO"', parse(";")
while `"`1'"' != "" {
  if `"`1'"'!="BASE" cap adopath - `"`1'"'
  macro shift
}
adopath ++ "libraries/stata"

sysuse auto, clear

* Generates error
reghdfe price mpg, absorb(foreign) parallel(2)

* No error
reghdfe price mpg, absorb(foreign)
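One way to check what the restricted adopath actually resolves to is sketched below. This is not part of the original report, only a hedged diagnostic that uses built-in Stata and Mata commands. It assumes `parallel_map` is installed as an ado-file somewhere on the adopath; `capture noisily` keeps the do-file running if a lookup fails.

```stata
* Hypothetical diagnostic sketch: run after the adopath reset above
adopath                                    // current search order for ado-files
cap noi which reghdfe                      // file and version Stata actually finds
cap noi which ftools
cap noi which parallel_map                 // assumption: installed as an ado-file
mata: mata mlib query                      // Mata libraries currently searched
cap noi mata: mata which unlink_folder()   // is the function from the Windows error visible?
```

If `unlink_folder()` is reported as not found here as well, that would be consistent with the Mata library index not covering the local folder, although this check cannot confirm the cause on its own.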

Tracing through the code, the error appears to come from this code in parallel:

= cap noi parallel_map_inner , val(1/2)  nologtable maxproc(2) id(940640317) tmp_path("C:\Users\jreif\AppData\Local\Temp/")  : reghdfe, worker parallel_path("C:\Users\jreif\AppData\Local\Temp/PARALLEL_940640317")
 - Task 1 failed (error code 199)
 - Task 2 failed (error code 199)
unrecognized command
gen-li commented 1 year ago

I still got the same error.

Unsupported method: macosx-shell
system limit exceeded - see manual
system limit exceeded - see manual
r(1000);

gen-li commented 1 year ago

Hi sergio,

Thanks for your reply! I am using MacOS 13.1 (22C65)

caustindavis commented 1 year ago

I am also having this issue.

macOS 13.2.1
Stata 17.0 (updated 08 Mar 2023)

. which reghdfe
/Users/cadavis/Library/Application Support/Stata/ado/plus/r/reghdfe.ado
*! version 6.12.2 02Nov2021

. which parallel
/Users/cadavis/Library/Application Support/Stata/ado/plus/p/parallel.ado
*! version 1.20.0 19mar2019

. which ftools
/Users/cadavis/Library/Application Support/Stata/ado/plus/f/ftools.ado
*! version 2.49.0 06may2022

The MWE provided by @reifjulian produces the same error:

. sysuse auto, clear
(1978 automobile data)

. reghdfe price mpg, absorb(foreign) parallel(2)
Unsupported method: macosx-shell
system limit exceeded - see manual
system limit exceeded - see manual
r(1000);

end of do-file

r(1000);

Any help would be much appreciated!

sergiocorreia commented 1 year ago

Thanks for the ideas.

A few ways to get more info:

1) Run parallel suboption with verbose option:

reghdfe price mpg, absorb(foreign) parallel(2, verbose)

2) Just run the internal parallel_map program:

parallel_map, values(1 2 3): di "$task_id"

And see if you can click on "type log" (of the first one) and then paste its output

3) Run reghdfe with verbose option

reghdfe price mpg, absorb(foreign) parallel(2) verbose(2)

4) Run reghdfe with trace set to 2

set trace on
set tracedepth 2
reghdfe price mpg, absorb(foreign) parallel(2)

Could you paste the output of the four options above, to see what's going on? It looks different from the earlier problem of the adopath being located in a different place, which we could solve by adding an option to parallel_map to help it with the adopath.
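The four diagnostics above can also be collected in a single log file for pasting. This is a minimal sketch, not a required procedure: the log name `reghdfe_parallel_debug.log` is a hypothetical choice, and `capture noisily` is used so that the do-file continues after each failing command.

```stata
* Hypothetical wrapper: log all four diagnostics to one text file
log using reghdfe_parallel_debug.log, replace text name(dbg)

sysuse auto, clear

* 1) parallel suboption with verbose
cap noi reghdfe price mpg, absorb(foreign) parallel(2, verbose)

* 2) internal parallel_map program on its own
cap noi parallel_map, values(1 2 3): di "$task_id"

* 3) reghdfe with verbose(2)
cap noi reghdfe price mpg, absorb(foreign) parallel(2) verbose(2)

* 4) reghdfe with trace depth 2
set trace on
set tracedepth 2
cap noi reghdfe price mpg, absorb(foreign) parallel(2)
set trace off

log close dbg
```

The resulting text log can then be attached to the issue instead of copying output from the Results window.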

caustindavis commented 1 year ago

Wow! Thanks so much for the quick reply! Output below from each of the 4 snippets.

1. Run parallel suboption with verbose option:

    . reghdfe price mpg, absorb(foreign) parallel(2, verbose)

    Parallel information:

    Starting task 1
    Unsupported method: macosx-shell
    system limit exceeded - see manual
    system limit exceeded - see manual
    r(1000);

2. Just run the internal parallel_map program:

```stata
. parallel_map, values(1 2 3): di "$task_id"
                 <istmt>:  3499  unlink_folder() not found
Unsupported method: macosx-shell
system limit exceeded - see manual
system limit exceeded - see manual
```

3. Run reghdfe with verbose option:
    
    . reghdfe price mpg, absorb(foreign) parallel(2) verbose(2)

    [CMD] reghdfe price mpg, absorb(foreign) parallel(2) verbose(2)
    Parsing and validating options:

    Parsing varlist: price mpg

    macros:
        r(basevars) : "price mpg"
        r(indepvars) : "mpg"
        r(fe_format) : "%8.0gc"
        r(depvar) : "price"

    Parsing vce()

    macros:
        s(num_clusters) : "0"
        s(vcetype) : "unadjusted"

    Parsing dof()

    macros:
        s(dofadjustments) : "pairwise clusters continuous"
        s(num_clusters) : "0"
        s(vcetype) : "unadjusted"

    Parsing parallel options: 2

    macros:
        s(parallel_opts) : "maxproc(2) id(376780138) tmp_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/") "
        s(parallel_dir) : "/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138"
        s(parallel_maxproc) : "2"
        s(parallel_force) : "0"

    Passing main options to Mata

    - HDFE.absvars = `"foreign"'
    - HDFE.tousevar = `"__000000"'
    - HDFE.weight_type = `""'
    - HDFE.weight_var = `""'
    - HDFE.technique = `"map"'
    - HDFE.transform = `"symmetric_kaczmarz"'
    - HDFE.acceleration = `"conjugate_gradient"'
    - HDFE.preconditioner = `"block_diagonal"'
    - HDFE.parallel_dir = `"/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138"'
    - HDFE.parallel_opts = `"maxproc(2) id(376780138) tmp_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/")  "'
    - HDFE.drop_singletons = 1
    - HDFE.tolerance = 1.00000000000e-08
    - HDFE.maxiter = 16000
    - HDFE.compact = 0
    - HDFE.poolsize = 10
    - HDFE.verbose = 2
    - HDFE.parallel_maxproc = 2
    - HDFE.parallel_force = 0
    - HDFE.timeit = 0

    Parsing absorb(foreign) and initializing FixedEffects() object

    macros:
        s(G) : "1"
        s(has_intercept) : "1"
        s(save_any_fe) : "0"
        s(save_all_fe) : "0"
        s(absvars) : " "foreign""
        s(basevars) : "foreign"
        s(ivars) : " "foreign""
        s(cvars) : " """
        s(targets) : " """
        s(intercepts) : " 1"
        s(num_slopes) : " 0"
        s(extended_absvars) : "1.foreign"

    Loading fixed effects information:

    Initializing Mata object for 1 fixed effect

    +-----------------------------------------------------------------------------------------------+
    |  i | g |   Name   | Int? | #Slopes |    Obs.   |   Levels   | Sorted? | Indiv? | #Drop Singl. |
    |----+---+----------+------+---------+-----------+------------+---------+--------+--------------|
    |  1 | 1 | foreign  | Yes  |    0    |        74 |          2 |   Yes   |   No   |            0 |
    +-----------------------------------------------------------------------------------------------+

    Initializing panelsetup() and loading slope variables for each FE

    Estimating degrees-of-freedom absorbed by the fixed effects

    Working on varlist: partialling out and regression

    Parsing and expanding indepvars: mpg

    macros:
        r(not_omitted) : "1"
        r(varlist) : "mpg"
        r(fullvarlist_bn) : "mpg"
        r(fullvarlist) : "mpg"

    Loading and partialling out 2 variables in a single block

    Cleaning up the HDFE object so it can be saved/loaded from disk

    [Parallel] Loading and partialling 2 variables using up to 2 worker processes
              Each process will work in blocks of 0-1 variables
              Temporary files will be saved in /var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138
              - HDFE object saved in /var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138/data0.tmp
              - Data block #1 with 1 cols saved in /var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138/data1.tmp
              - Data block #2 with 1 cols saved in /var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138/data2.tmp

    Starting parallel processes:

    command: parallel_map, val(1/2) verbose maxproc(2) id(376780138) tmp_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/") : reghdfe, worker parallel_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_376780138")

    Parallel information:

    Starting task 1
    Unsupported method: macosx-shell
    system limit exceeded - see manual
    system limit exceeded - see manual
    r(1000);

4. Run reghdfe with trace set to 2
```stata
. set trace on

. set tracedepth 2

. reghdfe price mpg, absorb(foreign) parallel(2)
------------------------------------------------------------------------------------------------------------ begin reghdfe ---
- cap syntax, store_alphas
- if (!c(rc)) {
  Store_Alphas
  exit
  }
- cap syntax, shrug
- if (!c(rc)) {
  di as text _n `"    {browse "https://www.theawl.com/2014/05/the-life-and-times-of-%C2%AF_%E3%83%84_%C2%AF/":¯\_(ツ)_/¯}"'
  exit
  }
- cap syntax, worker [*]
- if (!c(rc)) {
  ParallelWorker, `options'
  exit
  }
- cap syntax anything(everything) [fw aw pw/], [*] VERSION(integer) [noWARN]
- if !c(rc) {
  _assert inlist(`version', 3, 5)
  if ("`warn'" != "nowarn") di as error "(running historical version of reghdfe: `version')"
  if ("`weight'"!="") local weightexp [`weight'=`exp']
  if (`version' == 3) {
  reghdfe3 `anything' `weightexp', `options'
  }
  else {
  reghdfe5 `anything' `weightexp', `options'
  }
  exit
  }
- if replay() {
  Replay `0'
  exit
  }
- loc keep_mata 0
- Cleanup 0 `keep_mata'
= Cleanup 0 0
  -------------------------------------------------------------------------------------------------- begin reghdfe.Cleanup ---
  - args rc keep_mata
  - loc cleanup_folder = !`keep_mata' & ("$LAST_PARALLEL_DIR"!="")
  = loc cleanup_folder = !0 & (""!="")
  - if (`cleanup_folder') cap mata: unlink_folder(HDFE.parallel_dir, 0)
  = if (0) cap mata: unlink_folder(HDFE.parallel_dir, 0)
  - global LAST_PARALLEL_DIR
  - global pids
  - if (!`keep_mata') cap mata: mata drop HDFE
  = if (!0) cap mata: mata drop HDFE
  - cap mata: mata drop hdfe_*
  - cap drop __temp_reghdfe_resid__
  - if (`rc') exit `rc'
  = if (0) exit 0
  ---------------------------------------------------------------------------------------------------- end reghdfe.Cleanup ---
- qui which ftools
- ms_get_version ftools, min_version("2.46.0")
  --------------------------------------------------------------------------------------------------- begin ms_get_version ---
  - syntax anything(name=ado), [min_version(string) min_date(string)]
  - mata: st_local("package_version", get_version("`ado'"))
  = mata: st_local("package_version", get_version("ftools"))
  - c_local package_version "`package_version'"
  = c_local package_version "2.49.0 06may2022"
  - loc _ `package_version'
  = loc _ 2.49.0 06may2022
  - gettoken version_number _ : _
  - gettoken version_date _ : _
  - c_local version_number "`version_number'"
  = c_local version_number "2.49.0"
  - c_local version_date "`version_date'"
  = c_local version_date "06may2022"
  - if ("`min_version'" != "") {
  = if ("2.46.0" != "") {
  - loc ok 0
  - cap mata: st_local("ok", strofreal(strtoreal(tokens(subinstr("`version_number'", ".", " "))) * (1e5, 1e3, 1)' >= strtoreal
> (tokens(subinstr("`min_version'", ".", " "))) * (1e5, 1e3, 1)'))
  = cap mata: st_local("ok", strofreal(strtoreal(tokens(subinstr("2.49.0", ".", " "))) * (1e5, 1e3, 1)' >= strtoreal(tokens(su
> binstr("2.46.0", ".", " "))) * (1e5, 1e3, 1)'))
  - _assert `ok', msg("you are using version `version_number' of `ado', but require version `min_version'")
  = _assert 1, msg("you are using version 2.49.0 of ftools, but require version 2.46.0")
  - }
  - if ("`min_date'" != "") {
  = if ("" != "") {
    loc ok = !mi(date("`version_date'", "DMY")) & (date("`version_date'", "DMY") >= date("`min_date'", "DMY"))
    _assert `ok', msg("you are using `ado' from `version_date', but require a version from at `min_date' or later")
    }
  ----------------------------------------------------------------------------------------------------- end ms_get_version ---
- cap noi Estimate `0'
= cap noi Estimate price mpg, absorb(foreign) parallel(2)
  ------------------------------------------------------------------------------------------------- begin reghdfe.Estimate ---
  - syntax varlist(fv ts numeric) [if] [in] [fw aw pw/] [ , Absorb(string) Group_id(varname numeric) Individual_id(varname num
> eric) AGgregation(string) VCE(string) CLuster(string) RESiduals(name) RESiduals2 DOFadjustments(string) GROUPVar(name) TEChn
> ique(string) TOLerance(real 1e-8) ITERATE(real 16000) TRAnsform(string) ACCELeration(string) PREConditioner(string) PRUNE NO
> SAMPle COMPACT POOLsize(integer 10) PARallel(string asis) noHEader noTABle noFOOTnote Verbose(integer 0) noWARN TIMEit KEEPS
> INgletons noPARTIALout varlist_is_touse noREGress KEEPMATA FASTREGress noCONstant noAbsorb2 ] [*]
  - loc timeit = ("`timeit'"!="")
  = loc timeit = (""!="")
  - if (`timeit') timer on 20
  = if (0) timer on 20
  - if (`verbose' >= 2) di _n `"{txt}{bf:[CMD]} {inp}reghdfe `0'"'
  = if (0 >= 2) di _n `"{txt}{bf:[CMD]} {inp}reghdfe price mpg, absorb(foreign) parallel(2)"'
  - cap drop __hdfe*
  - if (`verbose' > 0) di as text "{title:Parsing and validating options:}" _n
  = if (0 > 0) di as text "{title:Parsing and validating options:}" _n
  - _get_diopts diopts options, `options'
  = _get_diopts diopts options, 
  - loc drop_singletons = ("`keepsingletons'" == "")
  = loc drop_singletons = ("" == "")
  - loc compact = ("`compact'" != "")
  = loc compact = ("" != "")
  - loc has_standard_fe = (`"`absorb'"' != "")
  = loc has_standard_fe = (`"foreign"' != "")
  - loc report_constant = "`constant'" != "noconstant"
  = loc report_constant = "" != "noconstant"
  - loc has_teams = (`"`group_id'"' != "")
  = loc has_teams = (`""' != "")
  - loc has_individual_fe = (`"`individual_id'"' != "")
  = loc has_individual_fe = (`""' != "")
  - loc stop_before_partial_out = ("`partialout'" == "nopartialout")
  = loc stop_before_partial_out = ("" == "nopartialout")
  - loc stop_before_regression = ("`regress'" == "noregress")
  = loc stop_before_regression = ("" == "noregress")
  - loc fast_regression = ("`fastregress'" == "fastregress")
  = loc fast_regression = ("" == "fastregress")
  - if (`has_individual_fe') _assert `has_teams', msg("cannot set the individual() identifiers without the group() identifiers
> ") rc(198)
  = if (0) _assert 0, msg("cannot set the individual() identifiers without the group() identifiers") rc(198)
  - if ("`technique'" == "") loc technique = cond("`individual_id'"=="", "map", "lsmr")
  = if ("" == "") loc technique = cond(""=="", "map", "lsmr")
  - if ("`transform'" == "") loc transform "symmetric_kaczmarz"
  = if ("" == "") loc transform "symmetric_kaczmarz"
  - if ("`acceleration'" == "") loc acceleration "conjugate_gradient"
  = if ("" == "") loc acceleration "conjugate_gradient"
  - if ("`preconditioner'" == "") loc preconditioner "block_diagonal"
  = if ("" == "") loc preconditioner "block_diagonal"
  - if (`poolsize' == 0) loc poolsize = .
  = if (10 == 0) loc poolsize = .
  - if (`verbose'>-1 & "`keepsingletons'"!="" & "`warn'" != "nowarn") {
  = if (0>-1 & ""!="" & "" != "nowarn") {
    loc url "http://scorreia.com/reghdfe/nested_within_cluster.pdf"
    loc msg "WARNING: Singleton observations not dropped; statistical significance is biased"
    di as error `"`msg' {browse "`url'":(link)}"'
    }
  - if ("`cluster'"!="") {
  = if (""!="") {
    _assert ("`vce'"==""), msg("only one of cluster() and vce() can be specified") rc(198)
    loc vce cluster `cluster'
    }
  - if ("`aggregation'" == "") loc aggregation mean
  = if ("" == "") loc aggregation mean
  - if ("`aggregation'" == "average" | "`aggregation'" == "avg") loc aggregation mean
  = if ("mean" == "average" | "mean" == "avg") loc aggregation mean
  - _assert inlist("`aggregation'", "mean", "sum")
  = _assert inlist("mean", "mean", "sum")
  - loc function_individual "`aggregation'"
  = loc function_individual "mean"
  - if (`verbose' > 0) di as text "# Parsing varlist: {res}`varlist'" _c
  = if (0 > 0) di as text "# Parsing varlist: {res}price mpg" _c
  - ms_parse_varlist `varlist'
  = ms_parse_varlist price mpg
  - if (`verbose' > 0) return list
  = if (0 > 0) return list
  - loc depvar `r(depvar)'
  = loc depvar price
  - loc indepvars `r(indepvars)'
  = loc indepvars mpg
  - loc fe_format "`r(fe_format)'"
  = loc fe_format "%8.0gc"
  - loc basevars `r(basevars)'
  = loc basevars price mpg
  - if ("`weight'"!="") unab exp : `exp', min(1) max(1)
  = if (""!="") unab exp : , min(1) max(1)
  - if (`verbose' > 0) di as text _n "# Parsing vce({res}`vce'{txt})" _c
  = if (0 > 0) di as text _n "# Parsing vce({res}{txt})" _c
  - ms_parse_vce, vce(`vce') weighttype(`weight')
  = ms_parse_vce, vce() weighttype()
  - if (`verbose' > 0) sreturn list
  = if (0 > 0) sreturn list
  - loc vcetype `s(vcetype)'
  = loc vcetype unadjusted
  - loc clustervars `s(clustervars)'
  = loc clustervars 
  - loc base_clustervars `s(base_clustervars)'
  = loc base_clustervars 
  - loc num_clusters = `s(num_clusters)'
  = loc num_clusters = 0
  - confirm variable `base_clustervars', exact
  = confirm variable , exact
  - if (`stop_before_partial_out' & "`varlist_is_touse'" != "") {
  = if (0 & "" != "") {
    loc touse `varlist'
    loc varlist
    markout `touse' `base_clustervars' `group_id' `individual_id'
    }
  - else {
  - loc varlist `depvar' `indepvars' `base_clustervars' `group_id' `individual_id'
  = loc varlist price mpg   
  - marksample touse, strok
  - la var `touse' "[touse]"
  = la var __000000 "[touse]"
  - }
  - if (`stop_before_partial_out') loc varlist
  = if (0) loc varlist
  - loc valid_techniques map cg lsmr lsqr
  - _assert (`: list technique in valid_techniques'), msg("invalid technique: `technique'")
  = _assert (1), msg("invalid technique: map")
  - loc transform = lower("`transform'")
  = loc transform = lower("symmetric_kaczmarz")
  - loc valid_transforms cimmino kaczmarz symmetric_kaczmarz rand_kaczmarz
  - foreach x of local valid_transforms {
  - if (strpos("`x'", "`transform'")==1) loc transform `x'
  = if (strpos("cimmino", "symmetric_kaczmarz")==1) loc transform cimmino
  - }
  - if (strpos("`x'", "`transform'")==1) loc transform `x'
  = if (strpos("kaczmarz", "symmetric_kaczmarz")==1) loc transform kaczmarz
  - }
  - if (strpos("`x'", "`transform'")==1) loc transform `x'
  = if (strpos("symmetric_kaczmarz", "symmetric_kaczmarz")==1) loc transform symmetric_kaczmarz
  - }
  - if (strpos("`x'", "`transform'")==1) loc transform `x'
  = if (strpos("rand_kaczmarz", "symmetric_kaczmarz")==1) loc transform rand_kaczmarz
  - }
  - _assert (`: list transform in valid_transforms'), msg("invalid transform: `transform'")
  = _assert (1), msg("invalid transform: symmetric_kaczmarz")
  - loc acceleration = lower("`acceleration'")
  = loc acceleration = lower("conjugate_gradient")
  - if ("`acceleration'"=="cg") loc acceleration conjugate_gradient
  = if ("conjugate_gradient"=="cg") loc acceleration conjugate_gradient
  - if ("`acceleration'"=="sd") loc acceleration steepest_descent
  = if ("conjugate_gradient"=="sd") loc acceleration steepest_descent
  - if ("`acceleration'"=="off") loc acceleration none
  = if ("conjugate_gradient"=="off") loc acceleration none
  - loc valid_accelerations conjugate_gradient steepest_descent aitken none
  - foreach x of local valid_accelerations {
  - if (strpos("`x'", "`acceleration'")==1) loc acceleration `x'
  = if (strpos("conjugate_gradient", "conjugate_gradient")==1) loc acceleration conjugate_gradient
  - }
  - if (strpos("`x'", "`acceleration'")==1) loc acceleration `x'
  = if (strpos("steepest_descent", "conjugate_gradient")==1) loc acceleration steepest_descent
  - }
  - if (strpos("`x'", "`acceleration'")==1) loc acceleration `x'
  = if (strpos("aitken", "conjugate_gradient")==1) loc acceleration aitken
  - }
  - if (strpos("`x'", "`acceleration'")==1) loc acceleration `x'
  = if (strpos("none", "conjugate_gradient")==1) loc acceleration none
  - }
  - _assert (`: list acceleration in valid_accelerations'), msg("invalid acceleration: `acceleration'")
  = _assert (1), msg("invalid acceleration: conjugate_gradient")
  - loc preconditioner = lower("`preconditioner'")
  = loc preconditioner = lower("block_diagonal")
  - if ("`preconditioner'"=="off") loc preconditioner none
  = if ("block_diagonal"=="off") loc preconditioner none
  - loc valid_preconditioners none diagonal block_diagonal
  - foreach x of local valid_preconditioners {
  - if (strpos("`x'", "`preconditioner'")==1) loc preconditioner `x'
  = if (strpos("none", "block_diagonal")==1) loc preconditioner none
  - }
  - if (strpos("`x'", "`preconditioner'")==1) loc preconditioner `x'
  = if (strpos("diagonal", "block_diagonal")==1) loc preconditioner diagonal
  - }
  - if (strpos("`x'", "`preconditioner'")==1) loc preconditioner `x'
  = if (strpos("block_diagonal", "block_diagonal")==1) loc preconditioner block_diagonal
  - }
  - _assert (`: list preconditioner in valid_preconditioners'), msg("invalid preconditioner: `preconditioner'")
  = _assert (1), msg("invalid preconditioner: block_diagonal")
  - if (`verbose' > 0) di as text _n `"# Parsing dof({res}`dofadjustments'{txt})"' _c
  = if (0 > 0) di as text _n `"# Parsing dof({res}{txt})"' _c
  - ParseDOF, `dofadjustments'
  = ParseDOF, 
  - loc dofadjustments `s(dofadjustments)'
  = loc dofadjustments pairwise clusters continuous
  - if (`verbose' > 0) sreturn list
  = if (0 > 0) sreturn list
  - opts_exclusive "`residuals' `residuals2'" residuals
  = opts_exclusive " " residuals
  - if ("`residuals2'" != "") {
  = if ("" != "") {
    cap drop _reghdfe_resid
    loc residuals _reghdfe_resid
    }
  - else if ("`residuals'"!="") {
  = else if (""!="") {
    conf new var `residuals'
    }
  - if (`"`parallel'"' != "") {
  = if (`"2"' != "") {
  - if (`verbose' > 0) di as text _n `"# Parsing parallel options: {inp}`parallel'"' _c
  = if (0 > 0) di as text _n `"# Parsing parallel options: {inp}2"' _c
  - ParseParallel `parallel'
  = ParseParallel 2
  - if (`verbose' > 0) sreturn list
  = if (0 > 0) sreturn list
  - loc parallel_maxproc `s(parallel_maxproc)'
  = loc parallel_maxproc 2
  - loc parallel_dir `"`s(parallel_dir)'"'
  = loc parallel_dir `"/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_347266389"'
  - loc parallel_force `s(parallel_force)'
  = loc parallel_force 0
  - loc parallel_opts `"`s(parallel_opts)'"'
  = loc parallel_opts `"maxproc(2) id(347266389) tmp_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/")  "'
  - }
  - else {
    loc parallel_maxproc 0
    loc parallel_force 0
    }
  - if (`has_teams') {
  = if (0) {
    tempvar indiv_tousevar
    ValidateGroups `basevars' `base_clustervars' `exp', group_id(`group_id') touse(`touse') indivtouse(`indiv_tousevar') indiv
> idual(`individual_id')
    _assert ("`weight_type'"=="fweight") + ("`indiv_tousevar'" != "") < 2, msg("fweights are incompatible with individual ids 
> as there cannot be two observations for a given group-individual touple")
    }
  - mata: HDFE = FixedEffects()
  - if (`verbose' > 0) di as text _n `"# Passing main options to Mata"' _n
  = if (0 > 0) di as text _n `"# Passing main options to Mata"' _n
  - loc absvars `"`absorb'"'
  = loc absvars `"foreign"'
  - loc tousevar `"`touse'"'
  = loc tousevar `"__000000"'
  - loc weight_type `"`weight'"'
  = loc weight_type `""'
  - loc weight_var `"`exp'"'
  = loc weight_var `""'
  - loc optim_options absvars tousevar weight_type weight_var technique transform acceleration preconditioner parallel_dir par
> allel_opts
  - if (`has_teams') loc optim_options `optim_options' group_id individual_id indiv_tousevar function_individual
  = if (0) loc optim_options absvars tousevar weight_type weight_var technique transform acceleration preconditioner parallel_
> dir parallel_opts group_id individual_id indiv_tousevar function_individual
  - foreach opt of local optim_options {
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.absvars = {res}`"foreign"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.absvars = `"foreign"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.tousevar = {res}`"__000000"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.tousevar = `"__000000"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.weight_type = {res}`""' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.weight_type = `""'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.weight_var = {res}`""' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.weight_var = `""'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.technique = {res}`"map"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.technique = `"map"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.transform = {res}`"symmetric_kaczmarz"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.transform = `"symmetric_kaczmarz"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.acceleration = {res}`"conjugate_gradient"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.acceleration = `"conjugate_gradient"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.preconditioner = {res}`"block_diagonal"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.preconditioner = `"block_diagonal"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.parallel_dir = {res}`"/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_3472663
> 89"' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.parallel_dir = `"/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_347266389"'
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}`"``opt''"' "'
  = if (0 > 0) di as text `"    - HDFE.parallel_opts = {res}`"maxproc(2) id(347266389) tmp_path("/var/folders/8t/wp0xr4654bx1c
> 01fvb70fgz40000gn/T/")  "' "'
  - mata: HDFE.`opt' = `"``opt''"'
  = mata: HDFE.parallel_opts = `"maxproc(2) id(347266389) tmp_path("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/")  "'
  - }
  - loc maxiter = `iterate'
  = loc maxiter = 16000
  - loc optim_options drop_singletons tolerance maxiter compact poolsize verbose parallel_maxproc parallel_force timeit
  - foreach opt of local optim_options {
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.drop_singletons = {res}1"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.drop_singletons = 1
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.tolerance = {res}1.00000000000e-08"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.tolerance = 1.00000000000e-08
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.maxiter = {res}16000"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.maxiter = 16000
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.compact = {res}0"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.compact = 0
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.poolsize = {res}10"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.poolsize = 10
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.verbose = {res}0"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.verbose = 0
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.parallel_maxproc = {res}2"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.parallel_maxproc = 2
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.parallel_force = {res}0"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.parallel_force = 0
  - }
  - if (`verbose' > 0) di as text `"    - HDFE.`opt' = {res}``opt''"'
  = if (0 > 0) di as text `"    - HDFE.timeit = {res}0"'
  - mata: HDFE.`opt' = ``opt''
  = mata: HDFE.timeit = 0
  - }
  - if (`verbose' > 0) di as text _n `"# Parsing absorb({res}`absorb'{txt}) and initializing FixedEffects() object"'
  = if (0 > 0) di as text _n `"# Parsing absorb({res}foreign{txt}) and initializing FixedEffects() object"'
  - if (`timeit') timer on 21
  = if (0) timer on 21
  - mata: HDFE.init()
  - if (`timeit') timer off 21
  = if (0) timer off 21
  - mata: add_undocumented_options("HDFE", `"`options'"', `verbose')
  = mata: add_undocumented_options("HDFE", `""', 0)
  - if (`compact') {
  = if (0) {
    loc panelvar "`_dta[_TSpanel]'"
    loc timevar "`_dta[_TStvar]'"
    cap conf var `panelvar', exact
    if (c(rc)) loc panelvar
    mata: HDFE.panelvar = "`panelvar'"
    cap conf var `timevar', exact
    if (c(rc)) loc timevar
    mata: HDFE.timevar = "`timevar'"
    if (`verbose' > 0) di as text "## Preserving dataset"
    preserve
    novarabbrev keep `basevars' `base_clustervars' `panelvar' `timevar' `touse'
    }
  - mata: HDFE.vcetype = "`vcetype'"
  = mata: HDFE.vcetype = "unadjusted"
  - mata: HDFE.num_clusters = `num_clusters'
  = mata: HDFE.num_clusters = 0
  - mata: HDFE.clustervars = tokens("`clustervars'")
  = mata: HDFE.clustervars = tokens("")
  - mata: HDFE.base_clustervars = tokens("`base_clustervars'")
  = mata: HDFE.base_clustervars = tokens("")
  - if (`timeit') timer on 22
  = if (0) timer on 22
  - mata: estimate_dof(HDFE, tokens("`dofadjustments'"), "`groupvar'")
  = mata: estimate_dof(HDFE, tokens("pairwise clusters continuous"), "")
  - if (`timeit') timer off 22
  = if (0) timer off 22
  - if (`stop_before_partial_out') {
  = if (0) {
    if (`verbose' > 0) di as text "{title:Stopping reghdfe without partialling out}" _n
    c_local keep_mata 1
    exit
    }
  - if (`verbose' > 0) di as text "{title:Working on varlist: partialling out and regression}" _n
  = if (0 > 0) di as text "{title:Working on varlist: partialling out and regression}" _n
  - if (`verbose' > 0) di as text "# Parsing and expanding indepvars: {res}`indepvars'" _c
  = if (0 > 0) di as text "# Parsing and expanding indepvars: {res}mpg" _c
  - if (`timeit') timer on 23
  = if (0) timer on 23
  - ms_expand_varlist `indepvars' if `touse'
  = ms_expand_varlist mpg if __000000
  - if (`timeit') timer off 23
  = if (0) timer off 23
  - if (`verbose' > 0) return list
  = if (0 > 0) return list
  - loc indepvars "`r(varlist)'"
  = loc indepvars "mpg"
  - loc fullindepvars "`r(fullvarlist)'"
  = loc fullindepvars "mpg"
  - loc fullindepvars_bn "`r(fullvarlist_bn)'"
  = loc fullindepvars_bn "mpg"
  - loc not_omitted "`r(not_omitted)'"
  = loc not_omitted "1"
  - if (`timeit') timer on 24
  = if (0) timer on 24
  - mata: HDFE.partial_out("`depvar' `indepvars'", 1, 1)
  = mata: HDFE.partial_out("price mpg", 1, 1)
  - if (`timeit') timer off 24
  = if (0) timer off 24
  - if (`parallel_maxproc' > 0) {
  = if (2 > 0) {
  - if (`timeit') timer on 27
  = if (0) timer on 27
  - ParallelBoss
Unsupported method: macosx-shell
system limit exceeded - see manual
system limit exceeded - see manual
    if (`timeit') timer off 27
    }
  --------------------------------------------------------------------------------------------------- end reghdfe.Estimate ---
- Cleanup `c(rc)' `keep_mata'
= Cleanup 1000 0
  -------------------------------------------------------------------------------------------------- begin reghdfe.Cleanup ---
  - args rc keep_mata
  - loc cleanup_folder = !`keep_mata' & ("$LAST_PARALLEL_DIR"!="")
  = loc cleanup_folder = !0 & ("/var/folders/8t/wp0xr4654bx1c01fvb70fgz40000gn/T/PARALLEL_347266389"!="")
  - if (`cleanup_folder') cap mata: unlink_folder(HDFE.parallel_dir, 0)
  = if (1) cap mata: unlink_folder(HDFE.parallel_dir, 0)
  - global LAST_PARALLEL_DIR
  - global pids
  - if (!`keep_mata') cap mata: mata drop HDFE
  = if (!0) cap mata: mata drop HDFE
  - cap mata: mata drop hdfe_*
  - cap drop __temp_reghdfe_resid__
  - if (`rc') exit `rc'
  = if (1000) exit 1000
  ---------------------------------------------------------------------------------------------------- end reghdfe.Cleanup ---
-------------------------------------------------------------------------------------------------------------- end reghdfe ---
r(1000);
brendonmcconnell commented 1 year ago

I am having the same issues as others above, running Stata 17.0 on a Mac. I ran the suggestions in https://github.com/sergiocorreia/reghdfe/issues/236#issuecomment-1480022098 and got output similar to caustindavis's.

gen-li commented 1 year ago

I tried the parallel option in Stata 18. The issue still occurs.

Parallel information:

milypan commented 1 year ago

I ran into the same issue. The problem lies in parallel_map.ado, which should treat macOS as just another Unix flavor but fails to do so.

If you go into parallel_map.ado (likely located in ~/Library/Application Support/Stata/ado/plus/p) and edit the following line:

else if ("`method'" == "unix-shell") {

to be:

else if ("`method'" == "unix-shell" | "`method'" == "macosx-shell") {

then it should work.
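For anyone unsure which copy of the file to edit (for example, if packages are installed in a local subfolder, as in the original report), here is a rough sketch of how to locate it and re-test. It assumes parallel_map.ado is somewhere on your adopath; `findfile`, `c(os)` and `discard` are standard Stata, and the last two lines are just the minimal example from the top of this issue. I have not verified how parallel_map.ado derives the method name internally, so treat the `c(os)` check as a sanity check only.

```stata
* Find the copy of parallel_map.ado that Stata will actually load
findfile parallel_map.ado
display as text "Edit this file: " as result `"`r(fn)'"'

* Show the OS string Stata reports ("MacOSX" on a Mac)
display "`c(os)'"

* After saving the edit, drop cached programs so the edited ado-file is reloaded
discard

* Re-run the minimal example from the original report
sysuse auto, clear
reghdfe price mpg, absorb(foreign) parallel(2)
```

If `findfile` reports a path other than the one mentioned above, that is the file to change.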

sergiocorreia commented 1 year ago

Hi Michael,

Thanks for looking into this! It's a bug that has been bothering users for a while, but I only use Windows and Linux, so it was difficult for me to address it. I will add the required PR and then close this issue once I get confirmation that it works.