currencybot / open-exchange-rates

Open Exchange Rates API - free / open source hourly-updated currency data for everybody
http://josscrowcroft.github.com/open-exchange-rates/

Incorrect data #30

Closed nickray closed 11 years ago

nickray commented 12 years ago

It seems sometimes you give currency in USD, and sometimes USD in currency (i.e., the inverse).

The example below is from my Python wrapper, it's RUB and CHF on 2012-01-03 and 2010-01-03.

In [36]: fxrates.alternative("RUB", date=120103)
RUB = 31.687000, CHF = 0.932150, base USD
Out[36]: 33.993455988842996

In [37]: fxrates.alternative("RUB", date=100103)
RUB = 0.032990, CHF = 1.035750, base USD
Out[37]: 0.03185131547188028
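For anyone wanting to test for this kind of flip, here is a minimal sketch. It assumes that two quotes of the same pair in opposite directions multiply to roughly 1 while being far apart from each other. The function name `looks_inverted` and both thresholds are made up for illustration; they are not part of the wrapper or of the API.

```python
import math


def looks_inverted(rate_a, rate_b, tolerance=0.25, min_gap=math.log(2)):
    """Guess whether rate_a and rate_b quote the same pair in opposite directions.

    Both thresholds are illustrative assumptions:
    - tolerance: how far log(rate_a * rate_b) may be from 0
    - min_gap: how far apart the two rates must be (in log terms), so that
      currencies trading near parity are not reported as inverted
    """
    if rate_a <= 0 or rate_b <= 0:
        raise ValueError("exchange rates must be positive")
    product_near_one = abs(math.log(rate_a * rate_b)) < tolerance
    far_apart = abs(math.log(rate_a / rate_b)) > min_gap
    return product_near_one and far_apart


# RUB values from the two dates above: the product is about 1.045.
print(looks_inverted(31.687000, 0.032990))  # True
# CHF values from the same dates: close to each other, so not flagged.
print(looks_inverted(0.932150, 1.035750))   # False
```

The second condition matters for a currency like CHF that trades near 1 USD: the product of two ordinary quotes is also close to 1, so the product test alone cannot tell an inverted value from a normal one.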

currencybot commented 12 years ago

NB: skim-readers can skip to https://github.com/currencybot/open-exchange-rates/issues/30#issuecomment-5172066 for latest update!


Hey, I haven't heard of this happening before - all the open exchange rates data is stored in JSON files though - so if you're able to pinpoint where those inverse rates are coming from (which files), that would be very helpful in figuring out whether something is wrong.

Let me know if you can find them! Thanks

nickray commented 12 years ago

What a rapid response!

Here are the direct links (see the entry RUB for instance):
https://raw.github.com/currencybot/open-exchange-rates/master/historical/2012-01-03.json
https://raw.github.com/currencybot/open-exchange-rates/master/historical/2010-01-03.json

currencybot commented 12 years ago

Hey, thanks for that. I didn't notice you mentioned the dates before.

So that's a strange one, because many of the currencies seem accurate. The 2010 file is from our bulk import of historical data, which came from the same source. The bulk import is also behind the issue where timestamp is a string (not an integer, as it should be).

Thanks for bringing this up - I'll investigate this week and see if we need to recrawl the historical data.

For what it's worth, are the latest values accurate? The most important thing right now is that the realtime exchange rates are OK.

Also - your python wrapper sounds legit, are you able to publish it anywhere?

Cheers,

nickray commented 12 years ago

I sent you a pull request for my Python wrapper; use it at your convenience.

The problem is that in fact the latest values are incorrect, at least for RUB and CHF.

Here's some data for RUB:

2011-11-05 00:00:00 0.03255
2011-11-06 00:00:00 0.03255
2011-11-07 00:00:00 0.03275
2011-11-08 00:00:00 31.4657
2011-11-09 00:00:00 0.03278
2011-11-10 00:00:00 0.03265
2011-11-11 00:00:00 0.03301
2011-11-12 00:00:00 0.03301
2011-11-13 00:00:00 0.03301
2011-11-14 00:00:00 0.03271
2011-11-15 00:00:00 30.75605
2011-11-16 00:00:00 30.76028811
2011-11-17 00:00:00 30.81582305
2011-11-18 00:00:00 30.93696145
2011-11-19 00:00:00 30.80805633
2011-11-20 00:00:00 30.86879322
2011-11-21 00:00:00 31.14126565

Your scraper script should definitely do some plausibility checks (e.g., is the number different from the last one, and has the price moved by no more than, say, twice the median daily price move?).
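As a rough sketch of those two checks (not the scraper's actual code): `check_rate` is a hypothetical helper, moves are measured as absolute log changes, and days with no change are left out of the median. All three are assumptions made for this example.

```python
import math
from statistics import median


def check_rate(history, candidate, factor=2.0):
    """Return a list of warnings for `candidate`, given earlier daily rates.

    `history` must hold at least one earlier rate, oldest first.
    """
    warnings = []
    last = history[-1]

    # Check 1: is the number different from the last one?
    if candidate == last:
        warnings.append("unchanged since the last value")

    # Check 2: has it moved by more than `factor` times the median daily move?
    # Assumption: repeated values are skipped so they do not drag the median to 0.
    moves = [abs(math.log(b / a)) for a, b in zip(history, history[1:]) if a != b]
    if moves:
        typical = median(moves)
        move = abs(math.log(candidate / last))
        if move > factor * typical:
            warnings.append(
                f"log move {move:.4f} exceeds {factor} x median {typical:.4f}"
            )
    return warnings


# RUB values for 2011-11-05 to 2011-11-07 from the data above.
history = [0.03255, 0.03255, 0.03275]
print(check_rate(history, 31.4657))  # one warning: far beyond the usual move
print(check_rate(history, 0.03278))  # []
```

With a short history the median is a noisy estimate, so a factor of 2 will also flag some ordinary days; in practice the factor and the length of the history window would need tuning.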

Let me tell you, financial data is always dirty ;-). That's why the vendors can afford to charge ridiculous prices.

Cheers

currencybot commented 12 years ago

Excellent info, thanks for the rundown on the dodgy data. I'll check out the PR tomorrow, cheers for that.

Bumped this up to tomorrow's to-do list (it's bedtime here) and will get back to you once I've investigated.

There are a lot of extra things the script needs to do, actually. Sanity checks are a big one, but also cross-checking different providers for discrepancy would be ideal. I have a lot of free time coming up so I'll be implementing a bunch of those as separate processes (for now, they'll simply flag for review when something looks odd).

Thanks again!

wjcrowcroft commented 12 years ago

Hey, didn't realise I was writing from CurrencyBot's account! (was replying in Gmail) ... he won't be happy about that for sure.

I've checked out the latest files and it looks like the realtime data is grabbing the correct values:

CHF: 0.91356,
RUB: 29.576,

CHF: compare 0.914101849 from https://www.google.com/search?q=1+USD+in+CHF
RUB: compare 29.577923 from https://www.google.com/search?q=1+USD+in+RUB
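The same comparison could be automated along these lines. This is only a sketch: `cross_check` is a hypothetical helper, the 0.5% tolerance is an arbitrary assumption, and the reference values are simply the Google figures quoted above, hard-coded rather than fetched.

```python
def cross_check(ours, reference, tolerance=0.005):
    """Return {currency: relative gap} for rates that differ from the
    reference by more than `tolerance` (0.005 means 0.5%)."""
    flagged = {}
    for code, ref_rate in reference.items():
        if code not in ours:
            continue  # nothing to compare against
        gap = abs(ours[code] - ref_rate) / ref_rate
        if gap > tolerance:
            flagged[code] = gap
    return flagged


# Values from latest.json and from the Google searches quoted above.
ours = {"CHF": 0.91356, "RUB": 29.576}
google = {"CHF": 0.914101849, "RUB": 29.577923}
print(cross_check(ours, google))  # {} (both within 0.5%)

# An inverted RUB value like the historical ones would be flagged for review.
print(sorted(cross_check({"RUB": 0.03278}, google)))  # ['RUB']
```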

But you're right about the historical values (and about implementing sanity checks regardless), so I'm leaving this one open until I can get the historical data figured out.

tl;dr for speed-readers:

latest.json rates and historical daily rates going back to 2011-11-16 are unaffected.

affected currencies: RUB and CHF (so far) in historical data before 2011-11-16