Open chamb244 opened 1 year ago
Hi @chamb244, this is a documentation issue (thanks for the reminder!).
Stata itself uses a different version numbering scheme to haven. Current Stata versions use format 118 for most files, and 119 when there are more than 32,767 variables.
These are mapped to version = 14 and 15 respectively - they're both current formats but only Stata/MP supports more than 32,767 variables. In Stata the restriction on the number of variables is enforced using the file format rather than the actual number of variables included, so files written with version = 15 can only be opened by Stata/MP regardless of the number of variables included.
The default for write_dta() is version 14, which is the correct current format for files with 32,767 variables or fewer, but an explanation of the difference needs to be added to the documentation.
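As a rough illustration of the rule above, here is a minimal sketch that picks the `version` argument from the number of variables. `pick_dta_version()` is a hypothetical helper, not part of haven, and the 32,767 threshold is taken from the description above; only `write_dta()` and its `version` argument come from haven itself.

```r
# Hypothetical helper (not part of haven): choose the `version` argument
# for write_dta() from the number of variables in the data.
pick_dta_version <- function(n_vars) {
  # version = 14 (format 118) covers up to 32,767 variables; beyond that,
  # version = 15 (format 119) is needed, which only Stata/MP can open.
  if (n_vars > 32767L) 15L else 14L
}

df <- data.frame(x = 1:3, y = c("a", "b", "c"))
version <- pick_dta_version(ncol(df))

# Only attempt the write if haven is installed.
if (requireNamespace("haven", quietly = TRUE)) {
  path <- tempfile(fileext = ".dta")
  haven::write_dta(df, path, version = version)
}
```

Staying on version 14 unless the data really needs more columns keeps the file readable by every current Stata edition, not just Stata/MP.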
From the spec for reference:
The format of .dta files has changed over time. Stata 17 writes what are known as .dta format-118 files and can read all formats of files that have ever been released. The recent history of .dta formats is
Format    Current as of
---------------------------------------------------------------
119       Stata 15 - 17 (when dataset has more than 32,767 variables)
118       Stata 14 - 17
117       Stata 13
116       internal; never released
115       Stata 12
114       Stata 10
113       Stata 8
---------------------------------------------------------------
The issue described at https://github.com/tidyverse/haven/issues/461 still seems to be unresolved. The error still seems to depend on the version argument (versions before 15 work), as shown in the example below.
Created on 2023-05-10 with reprex v2.0.2