ropensci / neotoma

Programmatic R interface to the Neotoma Paleoecological Database.
https://docs.ropensci.org/neotoma

`get_download()` should return a matrix or data.frame for element `count.table` #21

Closed gavinsimpson closed 11 years ago

gavinsimpson commented 11 years ago

`get_download()` cross-tabulates the downloaded data with `xtabs()` and returns the result as component `count.table`, via the following code:

count.table <- xtabs(value ~ L1 + TaxonName, sample.data)

An object of class "xtabs" isn't that useful though; invariably one would want a data frame or a matrix, and it is somewhat tedious to convert an "xtabs" object to either of these more useful classes.
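To illustrate the conversion step, here is a minimal base R sketch. The toy `sample.data` is made up for illustration (only the column names `L1`, `TaxonName` and `value` come from the call above), and `count.mat` and `count.df` are hypothetical names, not part of the package.

```r
# Toy long-format data; column names follow the xtabs() call above,
# the values are invented for illustration.
sample.data <- data.frame(
  L1        = c("s1", "s1", "s2", "s2"),
  TaxonName = c("Pinus", "Quercus", "Pinus", "Quercus"),
  value     = c(10, 5, 3, 8),
  stringsAsFactors = FALSE
)

count.table <- xtabs(value ~ L1 + TaxonName, sample.data)

# Matrix: drop the "xtabs"/"table" classes and the stored "call" attribute.
count.mat <- unclass(count.table)
attr(count.mat, "call") <- NULL

# Data frame: as.data.frame() on a table returns long format, so call
# as.data.frame.matrix() to keep the samples-by-taxa layout.
count.df <- as.data.frame.matrix(count.table)
```

Neither step is hard, but every user has to know about both, which is the tedium referred to above.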

I propose that `get_download()` return component `count.table` as a matrix or a data frame.

This would involve casting the long-format data with reshape2's acast() (which returns a matrix) or dcast() (which returns a data frame) instead of the current use of xtabs().
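A minimal sketch of what the proposal might look like, assuming the reshape2 package is installed. The toy `sample.data` is invented for illustration, and the choice of `fun.aggregate = sum` with `fill = 0` is an assumption made here to mimic what `xtabs()` does; it is not necessarily what the package ended up using.

```r
library(reshape2)

# Toy long-format data; only the column names come from the issue.
sample.data <- data.frame(
  L1        = c("s1", "s1", "s2", "s2"),
  TaxonName = c("Pinus", "Quercus", "Pinus", "Quercus"),
  value     = c(10, 5, 3, 8),
  stringsAsFactors = FALSE
)

# acast() returns a matrix with samples as rows and taxa as columns.
count.mat <- acast(sample.data, L1 ~ TaxonName,
                   value.var = "value", fun.aggregate = sum, fill = 0)

# dcast() returns a data frame; the sample identifier becomes column L1.
count.df <- dcast(sample.data, L1 ~ TaxonName,
                  value.var = "value", fun.aggregate = sum, fill = 0)
```

With `dcast()` the sample identifier is an ordinary column rather than row names, which may or may not be what callers of `get_download()` expect.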

SimonGoring commented 11 years ago

Thanks (and thanks for all the pushes lately!). I agree. I think it wound up this way because I was having problems getting melt and recast to return an object that looked the way I wanted. In part it's an issue of my familiarity (or lack of it) with the reshape2 package. I'll work on it.

gavinsimpson commented 11 years ago

@SimonGoring I was looking at another issue with get_download() and implemented a dcast() solution to this issue whilst I was at it. The issue I was looking at was the two plyr calls that create sample.data in long format. These two calls are quite slow - plyr is known for its user-friendliness, not its speed - and can be replaced with some equivalent base R calls, essentially using do.call() and rbind.data.frame().

I'll issue a pull request for the issue21 branch in my fork so you can see what I have been doing.
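The base R replacement described above can be sketched roughly as follows. The `samples` list is a hypothetical stand-in, since the actual structure returned by the API and the two plyr calls are not shown in this thread; only the `do.call()` plus `rbind` pattern is taken from the comment.

```r
# Hypothetical per-sample records; the real structure is not shown here.
samples <- list(
  list(id = "s1", taxa = c("Pinus", "Quercus"), counts = c(10, 5)),
  list(id = "s2", taxa = c("Pinus"),            counts = c(3))
)

# Build one small data frame per sample ...
pieces <- lapply(samples, function(s) {
  data.frame(L1 = s$id, TaxonName = s$taxa, value = s$counts,
             stringsAsFactors = FALSE)
})

# ... then bind them all in one call. rbind() dispatches to
# rbind.data.frame() because every element is a data frame.
sample.data <- do.call(rbind, pieces)
```

Binding once at the end avoids repeatedly growing a data frame, which is usually where the time goes in this kind of loop.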