joshuaulrich / quantmod

Quantitative Financial Modelling Framework
http://www.quantmod.com/
GNU General Public License v3.0
809 stars 223 forks source link

279_getquote resillient to invalid symbols #282

Closed ethanbsmith closed 4 years ago

ethanbsmith commented 4 years ago

getQuote would fail if the source did not recognize any of the supplied symbols. this was particularly problematic on large batches where a single failure would cause a stop and there was no way to diagnose which symbol had caused the problem

all getQuote paths now support ";" delimited inputs All getQuote paths now return NA rows for missing quotes

ethanbsmith commented 4 years ago

ready for review. could not add tests for tiingo and av as these require a key

ethanbsmith commented 4 years ago

Good points.

I won’t be able to get to this till after market closes today. If you’re on a roll, I wont be offended if you make the changes

Also, on the Symbols <- unlist(strsplit(Symbols,',')) changes,

I think that processing can be moved up to the master getQuote function. I can do a separate ticket/pr to cover that if you agree on the approach

Not sure how to unit test the av and tiingo functions without a key though

ethanbsmith commented 4 years ago

ok, i made some changes. i think the diffs should be much cleaner now. Also, i think it's set up to consolidate the strsplit and merge code into the master function

joshuaulrich commented 4 years ago

I think you removed your functional changes, and some code that was in master (before your PR). I'll fix and merge tomorrow morning, at the latest.

Put your symbol splitting code in another branch, and I'll make sure it gets merged too.

On Mon, Nov 4, 2019, 5:12 PM Ethan Smith notifications@github.com wrote:

ok, i made some changes. i think the diffs should be much cleaner now. Also, i think it's set up to consolidate the strsplit and merge code into the master function

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/joshuaulrich/quantmod/pull/282?email_source=notifications&email_token=AAHZZWP3Y465O5VTJHFUDMDQSCT4PA5CNFSM4JIMGC3KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDBBJDY#issuecomment-549590159, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHZZWLSF2J2N44KNF7XV53QSCT4PANCNFSM4JIMGC3A .

ethanbsmith commented 4 years ago

i realized the merge call wasn't needed. you can just use the symbol names returned from the server. If we want to ensure that the output includes NA rows for input symbols, which isn't a bad idea, that can be centralized into the main function, after the getting the dispatched results. If you want that, I can do at the same time as the symbol splitting. I'll do that after this PR gets merged so the conflicts don't get complicated. I already opened an issue to track that

joshuaulrich commented 4 years ago

If we want to ensure that the output includes NA rows for input symbols, which isn't a bad idea, that can be centralized into the main function, after the getting the dispatched results.

We do want that, because that's the existing behavior. Removing it would break backward compatibility. Moving that to the main function should be in a separate PR, so I'm going to revert a71e1e59ed46f95a53187dd2c758637f4dafd26d in this branch.

I'm going to leave the Symbols <- unlist(strsplit(Symbols,';')) because I'm not sure your updates will work without them.

And I'm not going to make any changes until I understand my confusion, described below.


I just noticed that the getQuote functions still error if all the symbols are not valid. Your initial description said:

getQuote would fail if the source did not recognize any of the supplied symbols.

(Emphasis mine). I expected these calls to return objects that only contained missing values, but they still error.

quantmod::getQuote(c("FOO", "WYSIWYG"), src = "av", api.key = keyAV)
quantmod::getQuote(c("FOO", "WYSIWYG"), src = "tiingo", api.key = keyTiingo)
quantmod::getQuote(c("FOO", "WYSIWYG"), src = "yahoo")

Am I missing something? Maybe I don't understand the initial problem as well as I thought...


And another thing...

ready for review. could not add tests for tiingo and av as these require a key

My keys for Alpha Vantage and Tiingo are on TravisCI. They are stored in environment variables QUANTMOD_AV_API_KEY and QUANTMOD_TIINGO_API_KEY, respectively. Here's an example of how to use them. Notice that the environment variables are spelled wrong though...

https://github.com/joshuaulrich/quantmod/blob/1739987a23db6e9ec75744bcb8317b21c51d3d6f/tests/test_getSymbols.R#L4-L5

So you can write tests and they will run on TravisCI when you push. But please don't use them just yet... as I was testing this morning, I noticed that error messages contain the key in plain text. The TravisCI logs are public, and I don't want my keys displayed.

ethanbsmith commented 4 years ago

the centralized code is already in PR #285. it might simplify this PR if that one gets accepted first. not sure.

I just noticed that the getQuote functions still error if all the symbols are not valid.

(Emphasis mine). Interesting. I've never encountered that myself and didn't test for it. It's an odd scenario, but probably one we should account for as well. Are you ok if we handle that in this PR or do you want it broken out?

Thx for the info on the keys. i will add test for av and tiingo

joshuaulrich commented 4 years ago

Thx for the info on the keys. i will add test for av and tiingo

To reiterate, please don't push those tests until I'm convinced my keys won't be publicly visible in an error message.

I looked at the issue again, and I see that your example was for getQuote.yahoo(). I confirm that function throws an error if any symbol is not valid. getQuote.av() and getQuote.tiingo() add the rows for any missing symbols. That's why I was confused. It makes sense now.

I'll take a look at this again after market hours today.

ethanbsmith commented 4 years ago

I just noticed that the getQuote functions still error if all the symbols are not valid.

been stewing on this in the background all day. not as straight forward as I originally thought. it's fairly innocuous in the simple case of 1 or 2 tickers, but what about 2000 or 10000? if every single ticker you passed in is rejected, something is probably wrong. Sort of seems like a different use case from a disagreement on a subset of tickers. Not sure I have a strong view on "correct" behavior either way on this aspect and would support whatever you decide on