tomroh / bcputility

R package for fast bulk imports/exports from/to SQL Server with the bcp command line utility
https://bcputility.roh.engineering
Other
13 stars 1 forks source link

capture errors #15

Closed mkoohafkan closed 2 years ago

mkoohafkan commented 2 years ago

Sometimes bcp can fail, but in the current design bcpImport() will only throw an error if the system2() call is malformed. If bcp runs but fails for any other reason, there is no way to tell as bcpImport() does not return any value and there is no checking of either the code returned by system2() or examining the output. There are ways to capture the output of system2() via the stdout and stderr arguments, and the processx package provides an enhanced version of system2() that may be easier to work with.

For example, bcp was writing

SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Invalid character value for cast specification

but because I am calling it from a function, inside a larger routine, I never saw that output until I started noticing a bunch of zero-row tables and reran the commands manually.

tomroh commented 2 years ago

The ... argument is provided to for this purpose. If you name the arguments, they are passed through to system2. stdout and stderr can be captured in a file or as a character vector which you could use for error handling in your wrapper function.

mkoohafkan commented 2 years ago

That only works if the user provides file paths to write stderr and stdout to. There is currently no way to capture the output as a character vector, which both system2() and processx::run() support. Reading and parsing files to determine error codes is suboptimal.

The solution might be as simple as assigning the result of the system2() call to a variable and return()ing that variable at the end; if bcp is well-behaved the call should return 0 if it succeeded and some other value if it did not. This should also be compatible with the maxerrors argument, i.e., hopefully the system2()call will return 0 even if bcp errors, provided the total number of errors is below maxerrors.

tomroh commented 2 years ago

So the issue is that maxerrors returns a system2() error which stops the function but if maxerrors is not reached then it continues on silently? Wouldn't setting maxerrors to 0 fail fast and allow you to use tryCatch to find what failed?

mkoohafkan commented 2 years ago

No, my statement regarding maxerrors was speculation. The issue is that errors from bcp are not returned by bcpImport without writing and parsing files passed as stdout/stderr arguments . The only way bcpImport can return an error directly is if the system2() call is malformed or something goes wrong with any of the dbExecute calls. If bcpImport could return the integer result of the system2() call (rather than nothing, as is the current implementation) that might be sufficient to identify if bcp succeeded. I can test that but I won't be able to do so until Monday or Tuesday.

tomroh commented 2 years ago

005b9a6ec88209c97f689ff7af69312f4df3751f

This will be in the next release. The output of bcpImport now mimics the output of running system2('bcp'...).