BioStatMatt / sas7bdat

A reverse engineering of the sas7bdat database file format

found 0 rowsize subheaders where 1 expected? #2

Open floswald opened 11 years ago

floswald commented 11 years ago

Hi there! I'm not even sure this is a bug, but here is what I found. Trying to load this file

http://www.nber.org/cbsa-msa-fips-ssa-county-crosswalk/cbsatocountycrosswalk.sas7bdat

into R with

    library(sas7bdat)
    nber <- read.sas7bdat(file="cbsatocountycrosswalk.sas7bdat")

gives me the error

    found 0 row size subheaders where 1 expected

I've had an excellent experience with the package so far. Thanks a lot.

ClintCummins commented 11 years ago

Dear Florian,

This is what I call a "u64" file - it's the 64-bit file format, created on Linux in this case. TSP can read it just fine, but the sas7bdat code for R only reads Windows sas7bdat files at present (I think).

It would be best if the sas7bdat procedure flagged "u64" files as currently unreadable until it can handle them, instead of giving this particular error message.
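A minimal sketch of the kind of check suggested here, in R. It assumes, based on my reading of the reverse-engineered format notes, that the byte at offset 32 of the header is 0x33 in 64-bit ("u64") files; that offset and value are assumptions and should be checked against the format description. The names `is_u64_sas7bdat` and `read_or_flag` are hypothetical and are not part of the sas7bdat package.

```r
# Hypothetical helper: TRUE if the header looks like the 64-bit ("u64") layout.
# Assumption: the byte at 0-based offset 32 equals 0x33 in u64 files.
is_u64_sas7bdat <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 40L)
  if (length(header) < 40L) {
    stop("file is too short to contain a sas7bdat header")
  }
  # 0-based offset 32 corresponds to R index 33
  header[33] == as.raw(0x33)
}

# Hypothetical wrapper: refuse u64 files with a clear message instead of
# failing later with an unrelated subheader error. 'reader' is any function
# that takes a file path, for example read.sas7bdat.
read_or_flag <- function(path, reader) {
  if (is_u64_sas7bdat(path)) {
    stop("'", path, "' looks like a 64-bit (u64) sas7bdat file, ",
         "which this sketch treats as unsupported")
  }
  reader(path)
}
```

With the package loaded, this could be called as `read_or_flag("cbsatocountycrosswalk.sas7bdat", read.sas7bdat)`. The check only looks at one header byte, so it is a cheap guard rather than a full validation of the file.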

3293 rows, 15 columns:

SERIES
COUNTYNAME  3293 obs., 1-3293, N, County Name
STATE       3293 obs., 1-3293, N, ST
SSACOUNTY   3293 obs., 1-3293, N, SSACD
FIPSCOUNTY  3293 obs., 1-3293, N, FIPSCD
MSA         3293 obs., 1-3293, N, Old MSA
L           3293 obs., 1-3293, N, What does L signify?
MSANAME     3293 obs., 1-3293, N, OldMSA Name
CBSA        3293 obs., 1-3293, N, New MSA - if blank then rural are
CBSANAME    3293 obs., 1-3293, N, New MSA Name
SSAST       3293 obs., 1-3293, N, SSA State code
FIPST       3293 obs., 1-3293, N, FIPS State code
Y2005       3293 obs., 1-3293, N, Present in 2005 source file
Y2011       3293 obs., 1-3293, N, Present in 2011 source file
Y2012       3293 obs., 1-3293, N, Present in 2012 source file
Y2013       3293 obs., 1-3293, N, Present in 2013 source file

                           Univariate statistics
                           =====================

*\ WARNING in command 7 Procedure MSD: Missing values for series ====> Y2005: 4, Y2011: 4, Y2012: 20, Y2013: 20

Number of Observations: 3269

                    Mean       Std Dev       Minimum       Maximum

COUNTYNAME     985.72255     548.53432       1.00000    1950.00000
STATE           27.60661      14.48031       1.00000      52.00000
SSACOUNTY     1654.46987     945.74797       1.00000    3293.00000
FIPSCOUNTY    1653.57908     945.72007       1.00000    3292.00000
MSA            151.16733      88.75680       2.00000     375.00000
L                1.11838       0.32311       1.00000       2.00000
MSANAME        210.89936      99.89398       1.00000     375.00000
CBSA            71.25084     116.94162       1.00000     390.00000
CBSANAME        70.72040     116.30502       1.00000     388.00000
SSAST           27.82043      14.55036       1.00000      52.00000
FIPST           27.85225      14.59317       1.00000      52.00000
Y2005         2005.00000       0.00000    2005.00000    2005.00000
Y2011         2011.00000       0.00000    2011.00000    2011.00000
Y2012         2012.00000       0.00000    2012.00000    2012.00000
Y2013         2013.00000       0.00000    2013.00000    2013.00000

Sincerely,

Clint


BioStatMatt commented 11 years ago

Florian,

I've made some changes to the read.sas7bdat function that reflect the issue Clint raised. I was able to read your file with the newer version. You can test it by sourcing the R/sas7bdat.R file.

Regards, Matt
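One way to try the newer code without reinstalling the package is to source the file from a local checkout into its own environment. The sketch below assumes a local clone of the repository containing R/sas7bdat.R; the helper name `load_dev_reader` and the clone path are hypothetical, and the sourced file may need other packages or files that this sketch does not load.

```r
# Hypothetical helper: load read.sas7bdat from a local checkout without
# overwriting the function provided by the installed package.
load_dev_reader <- function(repo_dir) {
  src <- file.path(repo_dir, "R", "sas7bdat.R")
  if (!file.exists(src)) {
    stop("cannot find ", src)
  }
  env <- new.env()
  sys.source(src, envir = env)
  # inherits = FALSE: only accept the definition that was just sourced
  get("read.sas7bdat", envir = env, inherits = FALSE)
}

# Usage (paths are assumptions, adjust to the local setup):
# read_dev <- load_dev_reader("path/to/sas7bdat")
# nber <- read_dev("cbsatocountycrosswalk.sas7bdat")
# dim(nber)
```

If the read succeeds, `dim(nber)` can be compared with the 3293 rows and 15 columns that Clint reports above.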

On Wed, Apr 17, 2013 at 10:58 AM, Florian Oswald wrote: (original message, quoted above)