Crypto-toolbox / bitex

Crypto-currency Exchange API Framework
MIT License
484 stars 134 forks source link

Implement BitexResponse object, including formatted Exchange data #129

Open deepbrook opened 6 years ago

deepbrook commented 6 years ago

As discussed in #107 , the addition of a ResponseObject is favorable.

However, since BitEx is primarily an API wrapper, the custom Response object should not get in the way of the user - it needs to feel, look and handle like a requests.Response object.

One way to achieve this, is to store formatted data and the original requests.Response object in a new class, which additionally exposes all public attributes of the requests.Response as properties using the @property decorator.

Update: The class has been implemented with #140, thanks to @bebosudo ! Also thanks to @mattisback for the valuable feedback. Currently,the following methods have a formatter method implemented.

bebosudo commented 6 years ago

We could place a class inside the base interface class, something like:

class RESTInterface(..):

    class FormattedResponse(requests.Response):

        @property
        def formatted(self, request_response):
            raise NotImplementedError

and then each exchange interface re-implements it according to the kind of ticker response it receives:

class ThisExchange(RESTInterface):

    class ThisExchangeFormattedResponse(RESTInterface.FormattedResponse):

        @property
        def formatted(self, request_response):
            j = request_response.json()
            return (j["ask"], j["bid"], None, ..., j["time"])

    def ticker(self, pair, ...):
        req = self.request(...)
        return ThisExchangeFormattedResponse(req)

warning: code not tested, here for discussion only. (Explicitly calling the __init__ method of the parent class inside the class FormattedResponse should not be needed, since there should not be any operation in the init phase).

I'm unsure whether the inheritance from class ThisExchangeFormattedResponse(RESTInterface.FormattedResponse) could work.

Anyway, if you think it can be a good idea, I'll dig into it and propose an implementation. Otherwise feel free to suggest a better alternative.

MatthewCochrane commented 6 years ago

I started a spreadsheet for comparing the different responses. I'll keep working on it when I get some more time.

https://docs.google.com/spreadsheets/d/1jMkXpAGmKcGR0Dm9fQNYv85lunIuVjDUGE2qcB_J2hE/edit?usp=sharing

I've done most of the ticker responses so far and it seems like a good set of 'common' results would be:

All of the exchanges I've tested implement all of the above, with the exception of a few which don't implement timestamp (don't really need timestamp anyway, can just use time of the request - though might be worth specifying that the time is 'inaccurate' then).

bebosudo commented 6 years ago

Great job, @mattisback. I've contributed to your spreadsheet by adding a ticker on binance.

@nlsdfnbch any idea on how to proceed? do you approve the idea of the abstract wrapper class around requests, inside the RESTInterface class?

deepbrook commented 6 years ago

@bebosudo, I don't see the benefit of defining the class inside the interface class. But putting them in the same file is okay. I'll see to it that I'll define some tests for it today.

@mattisback i had a look at your spreadsheet, and can add that vaultoro only trades bitcoin/gold, which is why you probably weren't able to get a ticker out of it. Regarding the timestamp, you could define two timestamp fields, api_ts and client_ts, which hold both timestamps if available. Additionally, focus on the standardized methods endpoints, as the other methods are, IMO too sporadically implemented.

MatthewCochrane commented 6 years ago

Had a crack at implementing some of this. Let me know what you think of this approach.

Added a folder called formatters under interface:

+ bitex
  - api
  + interface
     + formatters
       - __init__.py
       - base.py
       - binance.py
       - bitfinex.py
       - ...

bitex/interface/formatters/base.py

import abc
import requests

class FormattedResponse(metaclass=abc.ABCMeta):
    """Formatted Response base class"""

    def __init__(self, method, params, response):
        assert (isinstance(response, requests.Response))  # can't remember if this is good practice in python
        self.method = method  # need to know the type of method so we know how to format it, just a string
        self.method_params = params  # do we need this?
        self.raw = response  # could be called response instead of raw

    @property
    def formatted(self):
        func_name = '_format_' + self.method
        func = getattr(self, func_name, None)
        if func is None:
            raise NotImplementedError
        return func(self.raw)

    @abc.abstractmethod
    def _format_ticker(self, response):
        raise NotImplementedError

    # ... define other abstract methods here

An example exchanges implementation (for one function at least) bitex/interface/formatters/bitfinex/py

from bitex.interface.formatters import FormattedResponse
from datetime import datetime

class BitfinexFormattedResponse(FormattedResponse):

    def _format_ticker(self, response):
        response_data = response.json()
        return {
            'timestamp': datetime.fromtimestamp(float(response_data['timestamp'])),
            'bid': float(response_data['bid']),
            'ask': float(response_data['ask']),
            'low': float(response_data['low']),
            'high': float(response_data['high']),
            'volume': float(response_data['volume']),
            'last': float(response_data['last_price'])
        }

    def _format_order_book(self, response):
        pass

    # etc...

Then the magic... To make it format a response use the following decorator, defined in bitex/utils.py

def format_response(func):
    """Formats a requests.Response object, wrapping it in a FormattedResponse object
    :param func     the function being called (eg. ticker or order_book)
    """
    @wraps(func)
    def wrapped(self, *args, **kwargs):
        """Wrap function."""
        try:
            class_name = self.__class__.__name__
            formatter_name = class_name + "FormattedResponse"
            formatter_class = getattr(sys.modules['bitex.interface.formatters'], formatter_name)
        except AttributeError:
            raise NotImplementedError('Formatter \'' + formatter_name + '\' does not exist')

        # run the function
        response = func(self, *args, **kwargs)

        return formatter_class(func.__name__, args, response)

    return wrapped

Then all you need to do is add the decorator in the exchange interface files. eg. bitex/interface/bitfinex.py

...
    @format_response  # <- wraps output in FormattedResponse (specifically BitfinexFormattedResponse)
    def ticker(self, pair, *args, **kwargs):
        """
        Return the ticker for the given pair.
        """
        payload = {'symbol': pair}
        payload.update(kwargs)
        return self.request('GET', 'v1/ticker/24hr', params=payload)
...

Lastly, it can then be used like this:


formatted_response = bitex.Bitfinex().ticker(bitex.BTCUSD)
print(formatted_response.formatted)
# {'timestamp': datetime.datetime(2017, 12, 18, 20, 35, 44, 526019),
#  'bid': 18629.0,
#  'ask': 18630.0,
#  'low': 18010.0,
#  'volume': 63690.87027664,
#  'last': 18630.0,
#  'high': 19891.0}

# note we can still see all of the original response data with formatted_response.raw
print(formatted_response.raw.json())
# {'bid': '18629.0',
#  'timestamp': '1513589744.5260189',
#  'ask': '18630.0',
#  'mid': '18629.5',
#  'low': '18010.0',
#  'volume': '63690.87027664',
#  'last_price': '18630.0',
#  'high': '19891.0'}
deepbrook commented 6 years ago

I like the principle implementation - however, see the unit test I uploaded to see what the changes required specifically are (it's mainly naming and access)!

MatthewCochrane commented 6 years ago

Yep, I'll pull your unit test and make some modifications to get it to pass.

bebosudo commented 6 years ago

Great work @mattisback, just a couple of review notes on the excerpt of formatter you showed here (had no time yet to review the rest):

  1. Don't use float() to convert the value from the string stored in the json, use decimal when handling money https://docs.python.org/3.6/library/decimal.html. Are you already using it @nlsdfnbch somewhere else in the library?

  2. assert is meant for "debugging" code on conditions that should never happen; they are also removed if code is optimized. I'd use an exception instead, sth like a ValueError.

  3. using double quotes could improve readability: from 'Formatter \'' + formatter_name + '\' does not exist' to "Formatter '" + formatter_name + "' does not exist". Or even better, using the format method of strings: "Formatter '{}' does not exist".format(formatter_name)

  4. The following docstring is useless:

    def wrapped(self, *args, **kwargs):
    """Wrap function."""

On the design, I don't see the point of placing the formatters in a separate module, and having another file for each formatter. using getattr mixed with sys like in:

class_name = self.__class__.__name__
formatter_name = class_name + "FormattedResponse"
formatter_class = getattr(sys.modules['bitex.interface.formatters'], formatter_name)

It sounds a bit like a "hack" and more error-prone to me, but maybe is because I'm not used to metaprogramming.

I'll try to implement a different version tomorrow, based on a FormattedResponse abstract class, as shown in my previous comment.

Merry Christmas! :smile:

MatthewCochrane commented 6 years ago

Just having a look at your unit test, a couple of things:

  1. not so important but perhaps InterfaceResponse is a better name than APIResponse as it works with the interface package not the API package (to avoid being confusing). Perhaps some other name?
  2. I've come up with a way to reflect all of the attributes from the requests.Response class onto the APIResponse object and to get your tests to pass but it required modifying how the mock object was built in your test_class_instance_handles_like_requestsResponse_instance test.

Have a look at the implementation here: https://github.com/mattisback/bitex/tree/FormattedResponse_to_APIResponse

MatthewCochrane commented 6 years ago

@bebosudo Thanks for the feedback!

  1. Ok great, I'll replace float with decimal
  2. Thanks. That makes sense. Will update. Is it worth having the runtime check do you think?
  3. Good suggestion.
  4. Haha yep. I just copied the wrapper above it as a template. Will remove from both.

The separate files separate out distinct concepts and and help avoid big, messy files which would be the case if we put the formatters in the same files as the interface implementations. Especially if it was a sub-class. It is a bit of a hack (or at least, might make it a little confusing to follow how the formatter works for people new to the repo), but it makes the code in the interface implementations very clean and separates out distinct concepts, making the code more modular. If you have time, have a look at the rest of my implementation. I've got it all working, with unit tests for ticker formatting on most exchanges (not the cleanest tests, but a starting point). The unit tests could be usable still if you have a go at an alternate implementation.

bebosudo commented 6 years ago
  1. I don't think is needed. "If it walks like a duck and it quacks like a duck, then it must be a duck."

The separate files separate out distinct concepts and and help avoid big, messy files which would be the case if we put the formatters in the same files as the interface implementations. Especially if it was a sub-class.

Thanks @mattisback, I used your approach and I separated the formatters and the interfaces modules. It took me a while, but I think I figured out a clean way to solve the problem, see PR #140.

I created a AbstractFormattedResponse class in bitex/formatters/base.py, which uses delegation to store the original requests' response object, and __getattr__ to retrieve methods and properties from the original response:

class AbstractFormattedResponse(requests.Response):

    def __init__(self, response_obj):
        self.response = response_obj

    @property
    def formatted(self):
        raise NotImplementedError

    def __getattr__(self, attr):
        """Use methods of the encapsulated object, when not available in the wrapper."""
        return getattr(self.response, attr)

In the same file I also created a namedtuple to store results in a "standard" and static way.

FormattedResponseTuple = namedtuple("FormattedResponse", ("bid",
                                                          "ask",
                                                          "high",
                                                          "low",
                                                          "last",
                                                          "volume",
                                                          "timestamp"))

Then we can easily create a formatter for each exchange (e.g., bitex/formatters/bitfinex.py):

class BitfinexFormattedResponse(AbstractFormattedResponse):

    @property
    def formatted(self):
        # {'bid': '14827.0', 'ask': '14828.0', 'high': '14961.0', 'low': '11730.0',
        #  'last_price': '14829.0', 'volume': '87898.99047435', 'timestamp': '1514042460.7861257',
        #  'mid': '14827.5'}
        data = self.json()
        return FormattedResponseTuple(bid=Decimal(data["bid"]),
                                      ask=Decimal(data["ask"]),
                                      high=Decimal(data["high"]),
                                      low=Decimal(data["low"]),
                                      last=Decimal(data["last_price"]),
                                      volume=Decimal(data["volume"]),
                                      timestamp=fromtimestamp(float(data["timestamp"]))
                                      )

I've also used some black magic and created a decorator that can be applied to the ticker methods, to ease and make the process appear cleaner. It's stored in bitex/utils.py and it's a bit more complex than a normal decorator because I wanted to be able to pass it as a parameter the class in which to encapsulate the result:

# The following is a decorator which accepts a parameter, which should be a class that encapsulate
# a response and add useful methods, such as formatter.
def format_with(class_to_use):
    """Pass a class to be used as a wrapper in the innermost function."""
    def real_decorator(function):

        @wraps(function)
        def wrapper(*args, **kwargs):
            return class_to_use(function(*args, **kwargs))

        return wrapper

    return real_decorator

And eventually, it all reduces to apply the decorator and pass it the customized class (on bitex/interfaces/bitfinex.py):

    @format_with(BitfinexFormattedResponse)
    @check_and_format_pair
    def ticker(self, pair, **endpoint_kwargs):
        """Return the ticker for a given pair."""
        self.is_supported(pair)
        if self.REST.version == 'v1':
            return self.request('pubticker/%s' % pair)
        return self.request('ticker/%s' % pair, params=endpoint_kwargs)

An usage example is:

>>> from bitex import Bitfinex
>>> resp = Bitfinex().ticker("btcusd")
>>> resp.formatted
FormattedResponse(bid=Decimal('14661.0'), ask=Decimal('14676.0'), high=Decimal('14961.0'), low=Decimal('11730.0'), last=Decimal('14661.0'), volume=Decimal('85
803.14989423'), timestamp=datetime.datetime(2017, 12, 23, 16, 40, 51, 487780))

What do you think? Any suggestion? Unfortunately I still haven't tried the unit tests, but if you think this could be a good way to solve the problem, I'll go through the tests provided.

MatthewCochrane commented 6 years ago

Hi @bebosudo Good work. Some great ideas in there. I like the idea of the parameterised decorator

@format_with(BitfinexFormattedResponse)

as opposed to my 'magic' approach to choosing a formatter. Makes it more verbose and obvious what the code does, although it will introduce some code duplication (ie you'll need to specify BitfinexFormattedResponse) for every method that you want to format. Perhaps a middle of the road solution is to define the formatter in the class somewhere (eg in the constructor) and then access that in the format_response wrapper from the self parameter.

What I don't understand with your method is how are multiple functions handled? For example, your code lets us format the ticker method but what about others? for example the return value from order_book will be very different to that from ticker so you can't just have a single formatted method. You need one method for each method that it will format. In addition to that, sometimes in the formatter you need to know what parameters you passed in, for example see my poloniex formatter https://github.com/mattisback/bitex/blob/06f787e6793e25d4e28810ae42f30d9faea591d9/bitex/interface/formatters/poloniex.py

So far in my branch I have working: all ticker formatters, support for multiple functions per exchange, integration with @nlsdfnbch 's desired formats (his unit test passes), fully working unit tests.

Let me know if you want me to make any changes to my branch or to submit a pull request and I will. For now I don't have the time to redo all the work I've done on another branch so I'll continue working on mine for now but if the other approach is preferable then I'll happily come back to contributing to this repo at some point.

deepbrook commented 6 years ago

Hey lads! First off, thanks for all your work! And sorry I haven't looked through all of this yet, but Christmas is busy time at our house- but I'll be sure to look over everything before new year's! Until then : merry holidays!

bebosudo commented 6 years ago

What I don't understand with your method is how are multiple functions handled? For example, your code lets us format the ticker method but what about others?

It's simply because so far I'm using only the ticker methods to collect prices, and didn't know that also other methods needed formatting. 😂

Let's wait for a feedback from @nlsdfnbch, before going on with this code battle 😉

Merry Christmas to all of you!

MatthewCochrane commented 6 years ago

@bebosudo sounds good, lets listen for @nlsdfnbch 's input :) And merry (belated) xmas to both of you, hope you have a good holiday!

deepbrook commented 6 years ago

Right! So - I finally had some time to look over both of your excellent contributions - thanks again for investing so much time and effort into this little project :) 👍

As for the implementation method, I'm strongly leaning towards @bebosudo 's implementation.

The issue of implementing the various formatters for each endpoint, could be handled as follows: implementing a formatter in each FormattedResponse class with the same name as the method it formats. (which is somewhat similar to @mattisback approach). So AbstractFormattedResponse would look like this:

class AbstractFormattedResponse(requests.Response):

    def __init__(self, method, response_obj):
        self.response = response_obj
        self.method = method

    @property
    def formatted(self):
        """Return the formatted json data."""
        getattr(self, self.method)

    def __getattr__(self, attr):
        """Use methods of the encapsulated object, when not available in the wrapper."""
        try:
            return getattr(self.response, attr)
        except AttributeError:
            return getattr(self, attr)

    def ticker(*args, **kwargs):
        raise NotImplementedError

    def order_book(*args, **kwargs):
        raise NotImplementedError

    def trades(*args, **kwargs):
        raise NotImplementedError

    def bid(*args, **kwargs):
        raise NotImplementedError

    def ask(*args, **kwargs):
        raise NotImplementedError

    ...  # etc.

And then all we need to update would be the wrapper, like so:


def format_with(formatter):
    """Pass a class to be used as a wrapper in the innermost function."""

    def real_decorator(function):
        @wraps(function)
        def wrapper(*args, **kwargs):
            return formatter(function.__name__, function(*args, **kwargs))

        return wrapper

    return real_decorator

What do you guys think ?

MatthewCochrane commented 6 years ago

Cool. Fair enough if you prefer the @format_with(FormatterClass) approach over @format_response for the decorator. I mean there's not a huge difference, it's a trade off between being easy to understand vs code duplication. I think you might be right that the ease of understanding is worth the extra code duplication.

Then there's the aliasing of the underlying response up to the FormattedResponse object.

def __getattr__(self, attr):
    """Use methods of the encapsulated object, when not available in the wrapper."""
    try:
        return getattr(self.response, attr)
    except AttributeError:
        return getattr(self, attr)

vs

    self.__dict__ = response.__dict__

Yours seems like a fine way to do it.

I kind of think a prefix on the function names could be a good idea. At the moment I use _formatted_ticker, _formatted_ask, etc. rather than just ticker or ask. There's no clash between the formatter names and requests.Response members at the moment but who's to say there won't be one in the future? For example, if requests.Response had a parameter called ask (or it ever gets added) then which function should FormattedResponse call when we call ask? It's method or the one in self.response? Anyway, worth consideration..

The other thing missing is passing through the arguments of the call in the interface through to the formatter. Take for example Poloniex. When we call .ticker(currency) on a Poloniex object, the response contains the ticker for every pair on the exchange. When we go to format it how will we know which pair the user requested if we don't have access to the parameters called to the ticker function?

Have a look at my current implementation of _format_ticker for Poloniex:

    def _format_ticker(self, response):
        response_data = response.json()
        d = response_data[self.called_method_params[0]]
        return {
            'timestamp': response.receive_time,
            'bid': float(d['highestBid']),
            'ask': float(d['lowestAsk']),
            'low': float(d['low24hr']),
            'high': float(d['high24hr']),
            'volume': float(d['baseVolume']),
            'last': float(d['last'])
        }
        # {'BTC_CLAM': {'id': 20, 'highestBid': '0.00058379', 'percentChange': '0.32601675', 'low24hr': '0.00042656',
        #               'baseVolume': '77.25725539', 'high24hr': '0.00058657', 'quoteVolume': '152023.08084615',
        #               'lowestAsk': '0.00058719', 'isFrozen': '0', 'last': '0.00058720'}, '
        #  BTC_XPM': {'id': 116, 'highestBid': '0.00002168', 'percentChange': '0.02300469', 'low24hr': '0.00002107',
        #             'baseVolume': '8.41716666', 'high24hr': '0.00002299', 'quoteVolume': '383520.44957097',
        #             'lowestAsk': '0.00002178', 'isFrozen': '0', 'last': '0.00002179'},
        #  ... (for each coin!)
        # }
deepbrook commented 6 years ago

Concerning the args & kwargs, I think it'd be easiest to simply pass them along into the constructor and save them; that way they can be accessed by the formatting method if necessary?

def __init__(self, method, response, *args, **kwargs):
    self.method = method
    self.response = response
    self.method_args = args
    self.method_kwargs = kwargs

And regarding the possibility of the method name or attribute being implemented into requests.Response, I think that's a risk we can live with.

bebosudo commented 6 years ago

@mattisback have you already implemented something following Nils's advice? Otherwise I'm going to work on it now

bebosudo commented 6 years ago

I tried to implement a new version on multiple endpoints (a6832bf72737eac169c5c01ed550aa385123d747), and I created two working examples of the ticker method on bitfinex and poloniex (I'm just a bit unsure about whether to use the quoteVolume or the baseVolume field on the Poloniex ticker endpoint: https://github.com/Crypto-toolbox/bitex/commit/a6832bf72737eac169c5c01ed550aa385123d747#diff-2f3b32f9747710655dbbd0c60f1a223eR27 ). I think discussion of the implementation should go into the related PR #140.

Edit: if you think this implementation is reasonable, I'll implement the formatters for all the other exchanges.

bebosudo commented 6 years ago

Since I'm working on a project, and I needed a standard way to access the ticker of many exchanges, I've implemented a dozen more exchange formatters in 0f389a6b993569e9c4ee04e006a82b0e791286b8, available inside PR #140.

All exchanges now return a standard tuple with these values: ("bid", "ask", "high", "low", "last", "volume", "timestamp") except for CCEX (https://github.com/Crypto-toolbox/bitex/pull/140/commits/0f389a6b993569e9c4ee04e006a82b0e791286b8#diff-bd2d7bf5a69e3d37d630f9911debdf42R16), which doesn't return volume metrics (should we still provide a formatter to CCEX even though it doesn't have volumes?).

Regarding the exchanges that didn't return the timestamp in the request, I returned a utc datetime of the local machine. Moreover, I don't know whether timestamp is appropriate, since these objects are datetime aware objects: if you have any hint for a better parameter name, feel free to suggest.

Feel free to provide suggestions!

Edit: Kraken is down for maintenance, when it will come back I'll add its formatter.

deepbrook commented 6 years ago

Congrats, you've encountered a perfect example for why I initially opted for dropping the formatter idea :'D I would define some sort of default value for data which is not acquirable (the most obvious one I can think of is None or N/A or similar) and pass that instead of it.

As for the timestamp, it sort of depends on how exact we want the data to be. Substituting a non-present timestamp from the server with a unix timestamp (as an example) can be anything from ping time and more off. Personally, I'd say that's OK - if you want real-time data, REST isn't the way to go to begin with so this shouldn't be much of an issue. However, if we want the data to be as exact as possible, adding another time field (as mentioned above) to the returned data seems like the way to go. It wouldn't necessarily have to be present in the tuple - we could also add it as an attribute to the Response object.

Also, I've seen you're returning some data as decimal.Decimal classes - I'd like the data to be strings only, and leave conversion to the user.

bebosudo commented 6 years ago

I would define some sort of default value for data which is not acquirable (the most obvious one I can think of is None or N/A or similar) and pass that instead of it.

In fact I'm already using None in my implementation, as you can see here: https://github.com/Crypto-toolbox/bitex/commit/0f389a6b993569e9c4ee04e006a82b0e791286b8#diff-bd2d7bf5a69e3d37d630f9911debdf42R28

adding another time field (as mentioned above) to the returned data seems like the way to go. It wouldn't necessarily have to be present in the tuple - we could also add it as an attribute to the Response object.

I liked the idea of the timestamp being inside the tuple returned from the formatter, so that everything needed is contained inside that tuple, but we can move the timestamp outside, on the response object, if you prefer so. I missed the part of your comment above where you suggested to use two different timestamps, one from the API, which could be None if it's not provided, and another one given by the library, which should always be present.

I'd like to discuss also what we want to return: a plain unix timestamp (an int, a string, a float?) or a datetime aware object? I'm inclined for the latter. (Now I remember that in my implementation only a couple of timestamps in exchange formatters are datetime aware, most of them are naive: I need to fix it in case this is approved).

Also, I've seen you're returning some data as decimal.Decimal classes - I'd like the data to be strings only, and leave conversion to the user.

As I discussed with @mattisback in issue #138, the problem is that many exchange return floats inside their json, and interpreting them as Decimal is the most conservative option: if we convert a json to string inside our formatter method, we're automatically introducing errors in the data during the conversion. I personally believe that this could increase awareness into the end user ("why are they using this strange Decimal class instead of a simple float?") and moreover if we apply either strings or decimal conversion, the user would need to convert it anyway to float, or to their favorite class, but at least bitex would not introduce any error. But I don't want to force this opinion, if you prefer to use strings, I'll adapt the formatters.

Looking forward to close this issue and see it in an officially released package in pip ;-)

deepbrook commented 6 years ago

I liked the idea of the timestamp being inside the tuple returned from the formatter, so that everything needed is contained inside that tuple, but we can move the timestamp outside, on the response object, if you prefer so. I missed the part of your comment above where you suggested to use two different timestamps, one from the API, which could be None if it's not provided, and another one given by the library, which should always be present.

No worries - I, too, would say a timestamp in the return value is necessary; it just needs to be obvious which is which.

I'd like to discuss also what we want to return: a plain unix timestamp (an int, a string, a float?) or a datetime aware object? I'm inclined for the latter. (Now I remember that in my implementation only a couple of timestamps in exchange formatters are datetime aware, most of them are naive: I need to fix it in case this is approved).

It should be a str (again, for compatibility purposes). What this looks like though, is another question. Perhaps strftime'd to "%Y-%m-%d %H:%M:%S:%f" ?

problem is that many exchange return floats inside their json, and interpreting them as Decimal is the most conservative option: if we convert a json to string inside our formatter method, we're automatically introducing errors in the data during the conversion.

values in a json string are strings before they are converted -therefore, the most conservative way to convert them is to a str. While you are correct that converting them to float (which is the default behaviour when parsing) introduces possible errors due to overflows, this doesn't happen with str (and, as you correctly pointed out) decimal.Decimal().

Essentially, what happens under the hood is this:

# Take a sample json string
json_str = '{"price":1.909420}'

# when calling json.loads(), the json module parses the string for values to convert - when reaching our float it does the following:
parsed_value = selected_parser_parser(json_str[9:17])

While it's a good thing you want to spread the word about the possibility of overflow errors when converting json strings to float, I don't believe it to be our job to do so.

Furthermore, complex calculations by professionals would probably be done using numpy or similar. In that case, however, they'd have to convert back from decimal.Decimal() to string, because numpy doesn't support decimal.Decimal as a dtype. So keeping the return values as strings is both the most accurate, as well as most agnostic way to return data which was initially passed as json-ified float.

bebosudo commented 6 years ago

I'd like to discuss also what we want to return: a plain unix timestamp (an int, a string, a float?) or a datetime aware object? I'm inclined for the latter. (Now I remember that in my implementation only a couple of timestamps in exchange formatters are datetime aware, most of them are naive: I need to fix it in case this is approved).

It should be a str (again, for compatibility purposes). What this looks like though, is another question. Perhaps strftime'd to "%Y-%m-%d %H:%M:%S:%f" ?

Ok for the string type of the object. I'm not sure I get the point of converting a unix timestamp, which is universal since it's seconds since 1970-01-01 00:00:00 UTC, into a string encoded with yet another standard which then needs to be strftime'd back in order to be able to work on it, and which looses timezone information. I'd prefer to give a datetime aware object, something like:

>>> from datetime import datetime
>>> import pytz
>>> 
>>> utc = pytz.utc
>>> utc.localize(datetime.utcnow())
datetime.datetime(2018, 1, 14, 15, 19, 18, 26209, tzinfo=<UTC>)

at least in the client_ts attribute. And then leave the api_ts, when available, as unix timestamp, saved as string. What do you think?


So keeping the return values as strings is both the most accurate, as well as most agnostic way to return data which was initially passed as json-ified float.

Ok, then I suggest we at least use Decimal as intermediary step before providing the data as strings to the user, instead of stringifying a float. So, something like this:

    data = self.json(parse_int=Decimal, parse_float=Decimal)

    return TickerFormattedResponseTuple(bid=str(Decimal(data["Bid"])),
                                        ask=str(Decimal(data["Ask"])),
                                        high=str(Decimal(data["High"])),
                                        ...

Do you agree?

deepbrook commented 6 years ago

Ok, then I suggest we at least use Decimal as intermediary step before providing the data as strings to the user, instead of stringifying a float. [...] That doesn't make sense. As said before, json passes the raw string to the constructor of the type you pass in parse_float - per default it's the float class. So it would call float(json_sub_string) on anything that looks like a float. If you pass parse_float=str it instead calls str(json_substring), which does zero conversion and is absolutely loss-free. Using your method would only take longer and have the identical result.

I'm not sure I get the point of converting a unix timestamp, which is universal since it's seconds since 1970-01-01 00:00:00 UTC, into a string encoded with yet another standard which then needs to be strftime'd back in order to be able to work on it, and which looses timezone information.

We can leave it as unix timestamp - but that's not very readable. Hence I'd strftime it - besides this format is well known and supported widely by many tools for storing and reading time series and related data (which can parse it easily from a string). As for the timezone information, that's a valid point but can be avoided using using %z which denotes the offset from UTC as HH:MM.

bebosudo commented 6 years ago

If you pass parse_float=str it instead calls str(json_substring), which does zero conversion and is absolutely loss-free.

ok, never tried with str, hence I thought it could not directly be "parsable".

Regarding the timestamps, then you suggest separating them from the response class, and add them as new properties, right? I think that we should stick to a standard, if we want to provide time as a string, e.g. ISO8601 May I suggest creating a string version, and a separate datetime version? So that client_ts_iso8601 is built with:

>>> import datetime
>>> datetime.datetime.utcnow().replace(tzinfo=datetime.timezone.utc).isoformat()
'2018-01-14T16:24:33.730384+00:00'

and client_ts_datetime is a datetime aware object:

>>> import datetime, pytz
>>> pytz.utc.localize(datetime.datetime.utcnow())
datetime.datetime(2018, 1, 14, 16, 41, 35, 288247, tzinfo=<UTC>)
deepbrook commented 6 years ago

client_ts_iso8601 is cumbersome, but the general idea is good. perhaps, instead do this: add an attribute received_at_dt to FormattedResponse, which stores the aware datetime object, and a property as follows:

@property
def received_at(self):
    return self.recv_at_dt.strftime('%-Y%m-%d %H:%M:%S.%f %z')

Or similar. What do you think?

bebosudo commented 6 years ago

To me is fine creating the two parameters inside FormattedResponse.

%-Y%m-%d %H:%M:%S.%f %z

Is there a reason you want to use that particular format instead of the ISO one?

deepbrook commented 6 years ago

It was my impression that %-Y%m-%dT%H:%M:%S.%f+%z creates this 2018-01-15T14:56:02.000+00:00

is that not ISO? If not, I leave the formatting to you then :D

edit: I just saw I used a different format string in my previous comment - either way, I'm absolutely ok with using the ISO format - I just didn't pay attention to the format I used.

bebosudo commented 6 years ago

Unfortunately %z is UTC offset in the form +HHMM or -HHMM (empty string if the the object is naive), but the ISO8601 standard needs a colon between hours and minutes.

I'd just like to avoid the "yet another standard" approach: https://xkcd.com/927/ ;-)

I'll implement the received_at attributes inside FormattedResponse, using str as parser.

Edit: I'm implementing it, but I don't see the need to create two custom properties. Placing them right inside the __init__ of the AbstractFormattedResponse class is enough now: if in the future these should become more "complex" functions, they can be moved to property methods.

class AbstractFormattedResponse(requests.Response):

    def __init__(self, method, response_obj, *args, **kwargs):
        ...
        self.received_at_dt = datetime.datetime.utcnow().replace(tzinfo=datetime.timezone.utc)
        self.received_at = self.received_at_dt.isoformat()