ProjectMOSAIC / fetch

Functions to fetch data

data file documentation through fetchData() #2

Open rpruim opened 9 years ago

rpruim commented 9 years ago

From @dtkaplan on October 19, 2012 18:47

I propose to add a documentation=TRUE argument to fetchData() that will cause a corresponding documentation file to be displayed. Such files can give descriptions, etc. of the data. The author would upload a documentation file in an appropriate format.

Copied from original issue: ProjectMOSAIC/mosaic#169

rpruim commented 9 years ago

Another option would be to take advantage of comment characters to make the files self-documenting. That way there is only one file to deal with.

In any case, I think fetching documentation should be a separate function (fetchDoc) instead of a flag.

And do we know if anyone else besides Danny is using this feature?
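
For illustration only, the self-documenting idea could look something like the sketch below. It assumes documentation is stored as leading lines that start with "#" in the data file itself. The fetchDoc() shown here is a hypothetical helper written for this example, not an existing function in the package.

```r
# Hypothetical helper: return the leading comment lines of a data file
# as its documentation. Assumes comments start with "#".
fetchDoc <- function(path, comment = "#") {
  lines <- readLines(path)
  is_comment <- startsWith(lines, comment)
  # Count only the comment lines that come before the first data line
  n_lead <- if (all(is_comment)) length(lines) else which(!is_comment)[1] - 1
  doc <- head(lines, n_lead)
  # Drop the comment marker and surrounding whitespace
  trimws(substring(doc, nchar(comment) + 1))
}

# Example file (made up for this sketch): two documentation lines, then CSV data
path <- tempfile(fileext = ".csv")
writeLines(c("# Foot measurements for a class of children",
             "# length: foot length in cm",
             "name,length",
             "Ann,24.1",
             "Bob,25.3"), path)

doc <- fetchDoc(path)                      # the two documentation lines
dat <- read.csv(path, comment.char = "#")  # the data, with comment lines skipped
```

Because read.csv() can skip comment lines via comment.char, the same single file serves both purposes, which is the point of the suggestion above.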

rpruim commented 9 years ago

From @nicholasjhorton on October 20, 2012 13:3

By feature, do you mean using fetchData()? I am, but that's because it's necessitated by the instructions in the second edition of the textbook and the AcroScore problems.

Overall, it works well for the built-in datasets.

Nick

Nicholas Horton Department of Mathematics and Statistics, Smith College Clark Science Center, Northampton, MA 01063-0001 http://www.math.smith.edu/~nhorton

rpruim commented 9 years ago

From @nicholasjhorton on March 18, 2013 19:46

We agreed to deprecate fetchData() in the next release (potentially over a 2 year period): this just gets added to the NEWS for now, but might eventually involve adding a warning when it is used.
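
As a rough sketch of what "adding a warning when it is used" might look like, base R provides .Deprecated() for this purpose. The function body below is a placeholder assumed for the example (it simply reads a CSV file) and is not the actual fetchData() implementation.

```r
# Stand-in for the real function; the read.csv() body is only a placeholder.
fetchData <- function(name, ...) {
  # Signals a warning each time the function is called
  .Deprecated(msg = paste("fetchData() is deprecated;",
                          "use data from packages or read the file directly."))
  utils::read.csv(name, ...)
}

# The function still works during the deprecation period, but warns
p <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:2), p, row.names = FALSE)
w <- tryCatch(fetchData(p), warning = function(w) w)
conditionMessage(w)
```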

rpruim commented 9 years ago

From @nicholasjhorton on March 19, 2013 10:43

Danny wrote: After falling asleep as soon as I got back to the room, I woke up at 10:30 with a simple idea to make fetchData() useful to instructors who want a simple way to post their own data.

• Instructors who want to use the system create their own file server, e.g. using Dropbox's public directory. They will put their files on that server.  
• They make up a short name for use by fetchData(), e.g. "NJH"
• They email that short name to me, together with a link to a file on their server.  I will then check it for uniqueness, create a directory by that name on the mosaic server and put the address of their server in a simple text file called "redirectName.txt"
• Instructors can then put whatever files they like on their own server.  A file with an address of [server name]/file.csv would be referred to as fetchData("NJH/file.csv")

I've prototyped this using the mosaic-web.org server, creating a NJH account which I've redirected to one of my non-fetchData() dropbox directories. (I don't have access to your server, but if you send me a link to a CSV file on your server, I'll update NJH to go to your server.)

Here's the prototype: remoteFetch() in the attached file fetchRemote.R. Ignore the name; it would be folded into fetchData()

Once you've sourced remoteFetch.R, you can try these commands with the files being served from a vanilla Dropbox directory on my account:

remoteFetch("NJH/Course1/mydata2.csv")
  weather     when
1    snow    night
2     sun      day
3    rain tomorrow

remoteFetch("NJH/mydata1.csv")
      Who Age
1    Bill   3
2 Charley   4
3   Debby   5

Regards, Danny
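
The attached prototype is not reproduced in this thread. Purely as an illustration of the redirect scheme described in the bullets above, a lookup could be sketched as follows. The function name resolveRemote(), its arguments, and the example URLs are all assumptions made for this sketch and are not taken from the prototype.

```r
# Hypothetical sketch: turn "NJH/Course1/mydata2.csv" into a full URL by
# looking up the instructor's server address in <root>/NJH/redirectName.txt.
resolveRemote <- function(path, redirectRoot, readRedirect = readLines) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  shortName <- parts[1]                        # e.g. "NJH"
  rest <- paste(parts[-1], collapse = "/")     # e.g. "Course1/mydata2.csv"
  # redirectName.txt holds the address of the instructor's own server
  redirectFile <- paste(redirectRoot, shortName, "redirectName.txt", sep = "/")
  base <- trimws(readRedirect(redirectFile)[1])
  # Avoid a doubled slash if the stored address ends with "/"
  paste(sub("/+$", "", base), rest, sep = "/")
}

# A fake reader stands in for the network lookup so the sketch runs offline
fakeReader <- function(file) "https://example.org/public/"
url <- resolveRemote("NJH/Course1/mydata2.csv", "https://mosaic.example", fakeReader)
url
# "https://example.org/public/Course1/mydata2.csv"
```

The resolved URL could then be passed to read.csv(). Injecting the reader as an argument is a design choice for this sketch only; it keeps the lookup testable without a live server.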

rpruim commented 9 years ago

How close are we to making fetchData() of general usefulness to others?

rpruim commented 9 years ago

Another stranded issue regarding fetchData(). This issue is less important to me than fixing what I consider to be bugs (things like side effects in the environment) and limited general usability.

rpruim commented 9 years ago

From @nicholasjhorton on February 26, 2014 16:40

I’m increasingly frustrated with fetchData(), in that it makes it hard to reference datasets, since students will often interchange things like:

ds = fetchData("KidsFeet")

or

kids = fetchData("KidsFeet")

then wonder why their later commands don’t work (using the other name).

Can we move to referencing dataframes from packages?

Just my $0.02,

Nick

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton
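
To illustrate the point about referencing data frames from packages: a packaged dataset always arrives under its own fixed name, so every student's later commands refer to the same object. The sketch below uses the women dataset from R's built-in datasets package as a stand-in, since it needs nothing installed; it is not one of the course datasets discussed here.

```r
# 'women' ships with R's built-in datasets package and stands in
# for a course dataset in this example.
data(women, package = "datasets")  # always creates an object named 'women'
mean(women$height)

# Alternatively, skip the loading step and use the fully qualified name
nrow(datasets::women)
```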

rpruim commented 9 years ago

I don't use fetchData(). I use data in packages or read.file() with a reasonable URL. Lately, I've also been using read.xls() some to read data directly from Excel files. But note that these latter options suffer from the same naming issues that Nick is concerned about.

I think use of fetchData() should be removed from our public documents except to document how fetchData() works, and then only if it is useable by others for quick delivery of "late breaking" data. For other things, we should be more stable and have well documented data in packages.

If this is only useful as a way for Danny and a few others to distribute their data, then we should put it in a separate package, and document it as a tool for that more limited purpose.

I'd like to get this resolved by the end of March. We've been dancing around this for too long.

rpruim commented 9 years ago

Seems we are still dancing around fetchData(), but since we are relying on it less and less, and I never use it, I'm going to move this into "dormant ideas" and open a new issue about whether fetchData() has a role (and should be corrected to serve that role) or whether we should just remove fetchData() altogether.