Closed crubb closed 7 years ago
Hi @crubb,
Nice patch.
I think the reason the tests have failed is because the new config parameter has to be configured in two files. You have made the change in defaults/config/default.dcf
which is where defaults are applied if there is no config item specified during load.project()
.
However, you'll also want to update defaults/full/config/global.dcf
so that new projects after create.project()
have the new parameter automatically.
Not sure if there are also other reasons for the test failing, but this should catch some of them. There also appear to be problems with some of the reader tests, but that might be related.
On the patch itself, what's your opinion on how to set column names if none are specified in the raw data? Does R just assign defaults, and does the user then fix this in a munge script?
@crubb the other thing you might want to do is build the project and the push a commit after the build. That will generate the actual man pages from the .Rd
files.
I do this using the devtools
package and run the commands:
setwd(<ProjectTemplateDir>)
library(devtools)
document()
install(".")
Hey @connectedblue,
R gives default names (X1, X2, X3, ..., Xn). It's the default behavior of e.g. the read.csv() function. So I suppose that should be fine.
Apparently the tests are failing, because of Undefined global functions or variables: config
, if I get that correctly. That's strange, I'm already using it in a project and it is working fine.
I had run roxygen2 on the package already, and there is a man page? What is missing?
Best regards, crubb
Oh, yes, sorry my mistake. I didn't scroll down far enough to see the generated man page.
A quick look at the tests confirms the problem: there is no load.project()
run in any of the test cases, hence no config
variable in the global env. You may be able to fix that by adding the standard boiler plate to the relevant tests:
test_project <- tempfile('test_project')
suppressMessages(create.project(test_project, minimal = FALSE))
on.exit(unlink(test_project, recursive = TRUE), add = TRUE)
oldwd <- setwd(test_project)
on.exit(setwd(oldwd), add = TRUE)
suppressMessages(load.project())
You should perhaps think about adding a new test case or two for the case where the config variable is set to FALSE
, just to make sure things don't break in the future.
You can run the tests locally by loading the testthat
library and running tests/run-all.R
. It's quick and gives some useful output to help debug where the tests are failing.
Hi @crub, Thanks for the PR. Any progress on getting the tests to pass?
Will have next week free to look further into this. Never authored a test before cough ;)
Am 28.11.2016 um 22:11 schrieb Kenton White notifications@github.com:
Hi @crub https://github.com/crub, Thanks for the PR. Any progress on getting the tests to pass?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/johnmyleswhite/ProjectTemplate/pull/168#issuecomment-263395375, or mute the thread https://github.com/notifications/unsubscribe-auth/ABgEdR-mBP80Oysi2l9ok1p0EXAlJjz2ks5rC0NlgaJpZM4K8-q6.
It looks scarier than it actually is. If you run the tests locally, you'll get a readout of which ones ar efailing because the config
variable doesn't exist. The boiler plate above should fix that and you can then see the wood for the trees.
Also, if you get a chance, could you look at the thread at the bottom of #169 and see if you can see what's going on there. I don't think it's related to the sqlite patch you posted earlier in the year, but you probably have more knowledge than me of odbc reading.
Hi @crubb, sorry to interject while you're in the middle of this, but I've had an additional thought on this feature.
I don't think the config file parameter is the best solution for this problem. The first reason is that it is a reader specific config and there are lots of those that live in reader files rather than globals.dcf
. The second reason is that it won't help for a project where you have two csv data sources - one with a header and one without.
I think the better home for this config parameter is in the .file
definition where type
is set to csv
and an additional parameter header
is passed into the csv.reader
as ...
options or similar.
The file.reader.R
would need to be re-architected to support this behaviour, but I think it would lead to more flexible specification of data files to load.
What do you think? I'm happy to chip in and propose a new structure for file.reader.R
if it helps.
@crubb @connectedblue I'm happy with a global variable for whether to include headers in csv and xslx files. Currently you can work around this by assigning a global environment default for the csv reader.
I understand that whether to include headers could be a file by file issue. I think the purpose of this PR and discussion is about making explicit the default behavior for these readers. Currently we default to reading headers and this default is not well documented. Making a global config variable makes this default explicit and hopefully easier for most ProjectTemplate users to work with.
Fair point, I'd forgotten about the xls case when I made my comment. This is in fact a global option across a number of reader types.
There is still the issue of projects that have both types of data (maybe a CSV file with a header and an excel without) but this is an edge case.
Sorry @crubb, I'll let you get back to your PR.
Hey guys,
I finally came back to the PR after a few months, sorry ;) The tests run fine locally now, but Travis fails on:
checking for sqlext.h... no
configure: error: "ODBC headers sql.h and sqlext.h not found"
ERROR: configuration failed for package ‘RODBC’
I don't really think that is related to my code changes?
Best regards, crubb
Hi @crubb
I'm sure I've seen this before from travis - it seems to have days when it doesn't like it's own build.
This sounds like crappy advice, but try again tomorrow. You can force a retest by closing and re-opening the PR.
I also have travis configured against my ProjectTemplate
clone so you're welcome to raise a test PR against connectedblue/ProjectTemplate2:master
to see if it passes.
But you're right - there's no obvious link between that error and your code
This sometimes happens. Travis builds a new environment from scratch and must download all of the required packages and unix programs. Sometimes the site hosting the package or program is down, which causes an error. No worries as I can usually restart the build and then it will pass. I'll let you know if that works here.
Ok I've fixed the problem with Travis. The fix requires that you merge the current master into your branch and then run the PR again.
That worked for both of my current PRs, thank you!
Hi @crubb - good news that the build works cleanly now.
Sorry - only just spotted something: Can you update website/configuring.markdown
also with the new variable?
I have a vague plan of being able to generate the man page and website documentation from one source for config, but for now both places need to be manually kept in sync. It's pretty much the same format as the man page so should be a cut and paste job.
@crubb I'm a bit worried about the behavior on an old project when migrate.project()
has not been run. In that instance, config$data_loading_header
will be NULL
, which will through an error in read.csv()
:
Error in !header : invalid argument type`
This could cause an old project to fail suddenly. Of course the user will be warned that they should run migrate.project()
.
How do you think this edge case (where a user upgrades but does not run migrate.project()
should be handled?
@crubb never mind -- I've done some more testing and digging, and the .load.config()
function will handle the edge case. Merging and closing. Thanks!
Hey,
currently, all CSV, TSV, WSV and XLSX files are read with
header = TRUE
. In this pull request I propose to still treat this as a default, but also enable users to specifyheader = FALSE
.Best regards, crubb