stewartwatts closed 10 years ago
Yeah, we should finally clean up all of these data sets to remove any remaining row names. I'd also like to gzip everything to save on bandwidth when downloading the package.
I want to report the same issue: a dot in a column name makes Formula fail:
julia> lm(:(Petal.Length ~ Species), iris)
ERROR: Petal not defined
in anonymous at /home/dzea/.julia/v0.2/DataFrames/src/dataframe.jl:1504
in with at /home/dzea/.julia/v0.2/DataFrames/src/dataframe.jl:1505
in anonymous at /home/dzea/.julia/v0.2/DataFrames/src/formula.jl:173
in map at cell.jl:19
in ModelFrame at /home/dzea/.julia/v0.2/DataFrames/src/formula.jl:173
in lm at /home/dzea/.julia/v0.2/GLM/src/lm.jl:37
in lm at /home/dzea/.julia/v0.2/GLM/src/lm.jl:42
I just haven't had time to get to this. I would definitely appreciate help with the simplest version: calling clean_colnames and removing every column whose name is empty (``).
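For anyone picking this up, here is a rough sketch of the two cleanup steps described above, operating on a plain vector of names rather than a DataFrame. It is only an illustration: `sanitize_name` is a hypothetical helper, not the actual `clean_colnames` implementation, and it assumes current Julia (1.x) syntax rather than the v0.2 shown in the traces.

```julia
# Hypothetical helper (not part of DataFrames): make a column name usable in a
# formula by replacing every character that is not a letter, digit, or
# underscore with "_".
sanitize_name(name::AbstractString) = replace(name, r"[^A-Za-z0-9_]" => "_")

# Example names as they might come out of an R data set; the empty string
# stands in for a leftover row-name column.
raw = ["Sepal.Length", "Petal.Length", "", "Species"]

# Drop the empty names first, then sanitize what remains.
cleaned = [sanitize_name(n) for n in raw if !isempty(n)]

println(cleaned)
```

With names like these, `Petal.Length` becomes `Petal_Length`, so a formula no longer parses the dot as field access on an undefined `Petal`.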
Can be closed, I think.
Would it make sense to call clean_colnames!() by default in data.jl?
Creating a Formula with a "." in a colname causes a somewhat cryptic error.
ERROR: Non-call expression encountered
in dospecials at /home/stewart/.julia/DataFrames/src/formula.jl:68
in map at cell.jl:19
in dospecials at /home/stewart/.julia/DataFrames/src/formula.jl:72
in Terms at /home/stewart/.julia/DataFrames/src/formula.jl:128
in ModelFrame at /home/stewart/.julia/DataFrames/src/formula.jl:172
in lm at /home/stewart/.julia/GLM/src/lm.jl:37
in lm at /home/stewart/.julia/GLM/src/lm.jl:42
Formula: Fertility ~ :(+(Agriculture,Infant_Mortality))
Coefficients:
3x4 DataFrame:
        Estimate  Std.Error  t value    Pr(>|t|)
[1,]     21.9546    11.5285  1.90437   0.0634125
[2,]    0.208919  0.0686417  3.04362  0.00393547
[3,]     1.88563   0.535221  3.52308  0.00100803