Detach tool local file; include API requests

DarioHett commented 6 years ago

Hi Alex,

what do you think of detaching the tool from the basic .zip-file of the ECB homepage and instead make on-the-fly requests through the ECB API?

The http request looks like the following. A key would be D.USD.EUR.SP00.A for USD/EUR daily under the 'flowRef' EXR for Exchange Rates.

class EcbCurrencyConverter(CurrencyConverter):
    def ecb_request(key,flowRef, *pargs, **kargs):
        #get data
        url = 'https://sdw-wsrest.ecb.europa.eu/service/data/'+flowRef+'/'+key
        if kargs:
            url = url+'?'+param_url(kargs)
        print('Requested URL: '+url)
        r = requests.get(url,
                headers={'Accept':'text/csv'})
        return r

The API allows for further customisation using parameters in the URL(ECB SDMX 2.1 RESTful web service)

    def param_url(dict):
        param = ''
        for k,v in dict.items():
            param = param+k+'='+v+'&'
        param = param[:-1] #removes the last '&' sign
        return param

Returning a [[date,rate],[date, rate],...] per request or currency would look like:

    def ecb_read(ecb_request):
        r = ecb_request.content.decode().splitlines()
        c = csv.reader(r)
        date_rate = []
        for row in c:
            date_rate.append(row[6:8])
        return date_rate

And attaches here to your original (super)class

    def load_file(self, currency_file, **kargs):
        """currency_file here just takes the requested currency
        for example: 'USD'"""
        content = self.ecb_request(currency_file, 'EXR')
        self.load_lines(content)

I will have to do some work to properly set up the subclass (which might take some time due to lack of experience and work) but you get the idea.

This would also lend itself easily going forward to extend the tool to further sources and such have a more complete set of currencies.

Best regards Dario

alexprengere commented 6 years ago

Hello, Someone recently tried to do a very similar thing based on xe.com. It contains a very useful piece of code that gives you the latest rate for any currency. You can take a look at PR #12 for more details. I saw your PR #15, but I am only going to answer here since the issue and the PR are related.

The idea behind the current implementation (this might not be well documented) is to load the rates once when creating the CurrencyConverter object (about 400ms), then get near-instantaneous results when performing conversions (about 7µs). By comparison, just doing an HTTP request will probably take minimum 100ms (per conversion). This might not be a problem if you do not plan to perform thousands of conversions, but this is my use case. I took a look at the API you are using in the PR, the XML returned is not updated more than once a day I think, so actually you do not need to call it more than once a day, it effectively contains the same information as the CSV file.

Another point, since you mentioned extensibility: the format the currency converter uses is the ECB format. It is a very simple CSV format, and I think people can very easily generate a custom file containing rates from another source, then use c = CurrencyConverter('file.csv'). I believe this is more extensible than having to override the loading code to a specific API that uses a different XML grammar.

DarioHett commented 6 years ago

Hi Alex,

thank you for the feedback, especially since this is my first contribution on github.

It was just an idea of making the tool itself more lightweight in terms of memory and/or disk space, which is nonsense given your use case. This works under the assumption that the user does not need the entire data set which is shipped in the first place. I should have thought of your last point right away, manipulating the input rather than the source.

For a different approach of custom input: I stored the same data as your zip in a file storage system by currency (sample: DarioHett/centralbankdata). I would previously fetch the required data and then feed the custom list to CurrencyConverter() such that I would skip load_file() and start with load_lines() - is there some way to achieve this? Otherwise I would just write a customized load_file() which passes a list right through to load_lines().

alexprengere commented 6 years ago

Well, congratulations for you first contribution 😉 You are doing great!

If you do not need the full dataset, I suggest to just use head on the file to keep only the first lines, it should work well and reduce the disk usage. Note that the full dataset is only 1Mo uncompressed though. But I understand if you need to cut down on the initial load time of 400ms.

I am not sure I understand your last point but I will try to answer it. You can pass currency_file=None during __init__ to avoid calling load_file. Then you can call load_lines with your custom data (it expects the first line to be the header).

c = CurrencyConverter(currency_file=None)
c.load_lines(open('eurofxref-hist.csv'))

DarioHett commented 6 years ago

Your answer nailed it despite the lack of precision in my question.

Two more things: (i) Have you thought about adding control flow for passing further data in load_lines/load_file when the object has already been initialized? Obviously my idea stems from the mentioned file storage and keeping it extensible without previously creating a single input file. Just initialize with the .zip and then pass it another .csv to add a 'column' if one wishes - that is the idea. It also adds to functionality, since your currency converter may as well be used to discount values simply by attaching another list of interest rates to the existing _rates dictionary.

(ii) Several currencies have been assigned another identifier once they were redenominated (ie TRL to TRY); should there be a second 'mode' where these attach to one another? I assume you did not concatenate those lists because of your own use case.

alexprengere commented 6 years ago

I have never had those use cases so I did not really think about it.

I guess the easiest way to extend an object with new data is actually to create it from scratch again with the previous lines concatenated with new additional lines (custom code to handle this use case would have to be justified, since we are only talking about a few ms of processing time).

lines = list(open('eurofxref-hist.csv'))

c1 = CurrencyConverter(currency_file=None)
c1.load_lines(lines)

# Get other_lines from somewhere else
c2 = CurrencyConverter(currency_file=None)
c2.load_lines(lines + other_lines)

For your second point, I did not concatenate the lists because they were not concatenated in the raw data set. I guess you could merge them manually if needed, either in the CSV, or using the _rates, but I think the current system works well, e.g. to have them as two separate currencies.

alexprengere / currencyconverter

Detach tool local file; include API requests #14