nautechsystems / nautilus_trader

A high-performance algorithmic trading platform and event-driven backtester
https://nautilustrader.io
GNU Lesser General Public License v3.0
2.17k stars 488 forks source link

Using Cython compiler directives to detect bugs/improvements/corrections #283

Closed igorgodel closed 3 years ago

igorgodel commented 3 years ago

Using Cython compiler directives to detect bugs/improvements/corrections

Pay attention to the possibility to detect bugs/improvements/corrections using Cython compiler directives

# build.py
CYTHON_COMPILER_DIRECTIVES = {
    ...    
    # Warns about use of variables that are conditionally uninitialized.
    # (default: False)
    # "warn.maybe_uninitialized": True - allows to detect many bugs and corrections.
    "warn.maybe_uninitialized": True,
    # Warns about unused variables and declarations.
    # (default: False)
    # "warn.unused": True - allows to detect many bugs and corrections.
    "warn.unused": True,
    # Warns about unused function arguments.
    # (default: False)
    # "warn.unused_arg": True - allows to detect many bugs and corrections.
    "warn.unused_arg": True,
    # Warns about unused assignment to the same name.
    # (default: False)
    # "warn.unused_result": True - allows to detect many bugs and corrections.
    "warn.unused_result": True,

}

Only some examples:

1) "warn.maybe_uninitialized" directive

Setting the directive to 'True' allows to identify next bugs:   

    file:   \adapters\ccxt\execution.pyx
    method: CCXTExecutionClient.generate_order_status_report
    bug:    variable 'state' is uninitialized for all possible cases,
            append 'raise' for unsupported 'status' types

    file:   \backtest\data_producer.pyx
    method: CachedProducer.next
    bug:    variable 'data' is uninitialized for 'None' if there are no data

    file:   \backtest\engine.pyx
    method: BacktestEngine.__init__
    bug:    variable 'data_client' is not defined for 'all' client_type ...,
            append 'raise' for unsupported 'client_type' types

    file:   \backtest\exchange.pyx
    method: SimulatedExchange.process_order_book
    bug:    variables 'bid' and 'ask' are not defined for all possible cases,
            append 'raise' for unsupported 'OrderBookData' types

    ... 

NEED the detail analysis for each warning !!!

2) "warn.unused" directive

Setting the directive to 'True' allows to identify next 'bugs':   

    file:        \backtest\exchange.pyx
    method:      SimulatedExchange._build_current_bid_rates
    warning:     "Unused entry 'instrument_id'" 
    improvment:

        # current version:
        cdef inline dict _build_current_bid_rates(self):
           cdef InstrumentId instrument_id
           cdef QuoteTick tick
           return {instrument_id.symbol.value: price.as_decimal() for instrument_id, price in self._market_bids.items()}

        # new version 1:
        # declaring explicitly Cython's variables has no effect in this case 
        cdef inline dict _build_current_bid_rates(self):
           return {instrument_id.symbol.value: price.as_decimal() for instrument_id, price in self._market_bids.items()}

        # new version 2:
        # Use explicitly Cython's variables and 'for' 
        # Can really speed up code execution to build rates for xrate_calculation!!! 
        cdef inline dict _build_current_bid_rates(self):
           cdef InstrumentId instrument_id
           cdef Price price
           cdef dict rates = dict()
           for instrument_id, price in self._market_bids.items():
               rates[instrument_id.symbol.value] = price.as_decimal()

           return rates

    file:        \data\engine.pyx
    method:      DataEngine._handle_bars
    warning:     "Unused entry 'TimeBarAggregator'" 
    improvment:  declare the target variable

        # current version:
        ....
             cdef TimeBarAggregator
        ....

        # new version:
        ....
             cdef TimeBarAggregator aggregator
        ....

    ... 

NEED the detail analysis for each warning !!!

3) "warn.unused_arg" directive

Setting the directive to 'True' allows to identify next improvements:   

    file:        \indicators\atr.pyx
    method:      AverageTrueRange.__init__
    warning:     "Unused argument 'check_inputs'" 
    improvment:  remove arg 'check_inputs'

    file:        \model.instrument.pyx
    method:      Instrument.__init__
    warning:     "Unused argument 'financing'" 
    improvment:  remove arg 'financing' or define the correspondent data member ... 

    ... (there are tons ща false warnings) 

NEED the detail analysis for each warning !!!

4) "warn.unused_result" directive

Setting the directive to 'True' allows to identify next improvements:   

    file:        \execution\engine.pyx
    method:      ExecutionEngine.__init__
    warning:     "Unused result in 'config'" 
    improvment:  remove arg 'config' or define the correspondent data member ... 

    file:        \risk\engine.pyx
    method:      RiskEngine.__init__
    warning:     "Unused result in 'config'" 
    improvment:  remove arg 'config' or define the correspondent data member ... 

    ... 

NEED the detail analysis for each warning !!!

Thanks

cjdsellers commented 3 years ago

Excellent pickup on these compiler directives! I'll turn them on and investigate the results.

ian-wazowski commented 3 years ago

Great!

cjdsellers commented 3 years ago

I'm turning them on one at a time and fixing now.

cjdsellers commented 3 years ago

All warn.maybe_uninitialized have been fixed.

All warn.unused have been fixed except some generator expressions which I can't currently figure out how to modify to keep the compiler happy (so left this directive commented for now).

I fixed the two params in your point (3), however the compiler is picking up too many legitimate uses of self in method signatures for that to be a useful directive to me. Please advise if you find any more redundant parameters yourself though.

I have fixed the items in your point (4). I'll leave the directive commented and investigate more closely at a later date.

igorgodel commented 3 years ago

Yes, the compiler is picking up too many legitimate uses ...
Yes, leaving the directive commented is the best solution (for now) ...

After the release of the new version I'll try to view all warns

There is another question warning C4244: .. : conversion from 'int64_t' to 'double', possible loss of data warning C4244: .. : conversion from 'Py_ssize_t' to 'int', possible loss of data ... Sometimes it contains serious bugs and sometimes - not needed conversions. For example: "build/optimized\nautilus_trader\execution\cache.c(15163): warning C4244: '=': conversion from 'int64_t' to 'double', possible loss of data" If we open \nautilus_trader-1.116.1\nautilus_trader\execution\cache.pyx method: ExecutionCache.check_integrity we can find that implicit conversion int64_t to double is not necessary:

# current version
cdef double timestamp_us = unix_timestamp_us()
...
cdef int64_t total_us = round(unix_timestamp_us() - timestamp_us)

# new version
cdef int64_t timestamp_us = unix_timestamp_us()
...
cdef int64_t total_us = unix_timestamp_us() - timestamp_us
cjdsellers commented 3 years ago

If you're able to review these and tell me what needs to be changed that would be much appreciated :1st_place_medal:

cjdsellers commented 3 years ago

@igorgodel Current progress update on this issue:

I did find and fix some type conversion warnings, I'll keep running the check periodically and fixing as I go as each one requires some careful thought.

All warn.maybe_uninitialized warnings have been fixed.

I've had another look at warn.unused_result however it seems to pick up variables which are actually used, sometimes even on the next line. This warning class then possibly isn't something we could fix/incorporate into the build system ongoing?

I fixed some remaining warn.unused warnings, however there still remains the following 2:

Error compiling Cython file:
------------------------------------------------------------
...
                    "sort_priority": r.get("sortPriority"),
                }
                for r in market_definition["runners"]
            ],
        }
    if all(k in market_definition for k in ("eventType", "event")):
            ^
------------------------------------------------------------
nautilus_trader/adapters/betfair/providers.pyx:227:13: Unused entry 'genexpr'
Error compiling Cython file:
------------------------------------------------------------
...
        self._log.info(f"Pre-processing data stream...")

        if self._stream:
            # Set data stream start index
            self._stream_index = next(
                idx for idx, data in enumerate(self._stream) if start_ns <= data.ts_recv_ns
               ^
------------------------------------------------------------

nautilus_trader/backtest/data_producer.pyx:349:16: Unused entry 'genexpr'

I've had a go, however its not immediately obvious to me how to 'fix' these two warnings, save for rewriting the functions themselves.