Hi @Tzumx
Thanks for the detailed report!
I agree this is probably being caused by a strategy_id
which cannot be recovered from the cache for some reason. We recently had a large refactoring of the Binance adapter which is pending release very soon.
Are you able to try your testing again from v1.169.0 on the develop
branch? Alternatively, I hope to have this version released sometime this week.
If you find the same behavior then I can dig further.
The new version is now released.
I'm just after a little more information: is your strategy purely automated, or are you also sending external orders from a GUI or another source external to Nautilus?
Sorry for the late answer: yes, the strategy is fully automated and all orders are executed only from internal logic.
Thanks for the new version, we are going to test with it.
Hi Tzumx,
Did you manage to get any further with your testing?
I think I have a good idea of why this occurs: it's a race condition. During the round-trip latency after submitting an order, the in-flight check loop detects the order in flight and also requests its status. The order status report then comes back and generates a fill event, in addition to the fill event coming in over the WebSocket.
To test this theory you could turn off the in-flight checks:
LiveExecEngineConfig(inflight_check_interval_ms=0)
Or increase the `inflight_check_threshold_ms`
(I've actually bumped the default up to 2_000 ms now).
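For example, a minimal sketch of both options, assuming the engine config is wired into a `TradingNodeConfig` (the surrounding wiring here is illustrative, not taken from your setup, and import paths may vary slightly by version):

```python
from nautilus_trader.config import LiveExecEngineConfig, TradingNodeConfig

# Option 1: disable the in-flight check loop entirely (an interval of 0 turns it off)
config = TradingNodeConfig(
    exec_engine=LiveExecEngineConfig(inflight_check_interval_ms=0),
)

# Option 2: keep the checks but widen the threshold, so a normal round trip
# never triggers a duplicate status request
config = TradingNodeConfig(
    exec_engine=LiveExecEngineConfig(inflight_check_threshold_ms=5_000),
)
```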
To make this even more robust so there's no race condition, I think we need to check whether any trade ID has already been processed. I'll have a think about it.
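Roughly this shape, as a sketch of the idea (the class and method names here are hypothetical, not the actual engine internals):

```python
# Hypothetical sketch of a venue trade ID dedup guard.
class FillDeduplicator:
    def __init__(self) -> None:
        self._processed_trade_ids: set[str] = set()

    def should_apply_fill(self, trade_id: str) -> bool:
        # True only the first time a given venue trade ID is seen,
        # whether the fill arrived via the websocket or a status report.
        if trade_id in self._processed_trade_ids:
            return False
        self._processed_trade_ids.add(trade_id)
        return True
```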
Thanks for your investigation into this question. I'm now running on the latest version and haven't seen the bug yet, but it doesn't happen often, so I'll continue to observe and will certainly try your advice.
We still think the above is the most likely explanation. Also, I originally set the two config defaults back to front: the engine should check with the exchange more often, but allow a larger threshold before actually attempting a reconciliation, as the race condition here could be caused by any number of things.
Note the new defaults and improved docs:
class LiveExecEngineConfig(ExecEngineConfig, frozen=True):
    """
    Configuration for ``LiveExecEngine`` instances.

    The purpose of the in-flight order check is live reconciliation: events
    emitted from the exchange may have been lost at some point, leaving an
    order in an intermediate state, and the check can recover these events
    via status reports.

    Parameters
    ----------
    reconciliation : bool, default True
        If reconciliation is active at start-up.
    reconciliation_lookback_mins : NonNegativeInt, optional
        The maximum lookback minutes to reconcile state for.
        If ``None`` or 0 then will use the maximum lookback available from the venues.
    inflight_check_interval_ms : NonNegativeInt, default 2_000
        The interval (milliseconds) between checking whether in-flight orders
        have exceeded their time-in-flight threshold.
        This should not be set greater than the `inflight_check_threshold_ms`.
    inflight_check_threshold_ms : NonNegativeInt, default 5_000
        The threshold (milliseconds) beyond which an in-flight order's status
        is checked with the venue.
        As a rule of thumb, you shouldn't consider reducing this setting unless
        you are colocated with the venue (to avoid the potential for race
        conditions).
    qsize : PositiveInt, default 10_000
        The queue size for the engine's internal queue buffers.
    """

    reconciliation: bool = True
    reconciliation_lookback_mins: Optional[NonNegativeInt] = None
    inflight_check_interval_ms: NonNegativeInt = 2_000
    inflight_check_threshold_ms: NonNegativeInt = 5_000
    qsize: PositiveInt = 10_000
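For instance, constructing the config with these defaults and sanity-checking the interval/threshold relationship (a quick illustration, not part of the library docs):

```python
from nautilus_trader.config import LiveExecEngineConfig

config = LiveExecEngineConfig(
    inflight_check_interval_ms=2_000,   # poll in-flight orders every 2s
    inflight_check_threshold_ms=5_000,  # only query the venue once an order
                                        # has been in flight for over 5s
)
assert config.inflight_check_interval_ms <= config.inflight_check_threshold_ms
```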
So closing this for now unless you discover this continues to occur with the new settings (in which case please re-open).
Bug Report
We did a lot of testing of the platform recently and executed a large number of orders. Most of them were handled properly, but sometimes (rarely, once per 20-60 orders) we see some unusual behavior that results in an incorrect position calculation and breaks our logic.
Expected Behavior
We send a market order; it executes and changes the internal position only by the actual quantity that was filled.
Actual Behavior
Sending an order sometimes changes the internal position twice. For instance, a long of 1 ETH token may be calculated as a size of 2 tokens in the Nautilus position, even though on the exchange side it was only 1 token.
Steps to Reproduce the Problem
Logs:
We did some investigation, and our guess is this: the inferred event is generated from
`_generate_inferred_fill`
← `_reconcile_order_report`
← `_reconcile_report`,
which through the msgbus is reached from
`_send_order_status_report`
← `_generate_external_order_report`
in [execution.py](https://github.com/nautechsystems/nautilus_trader/blob/97ee22e98afa0017faeeb2cdd325e65dc6eb2514/nautilus_trader/adapters/binance/spot/execution.py#L743). So it could be because of a somehow missing strategy_id for that order in the cache.
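To illustrate what we think is happening, a toy sketch (not Nautilus code) of the double count when the same execution is applied twice:

```python
position_qty = 0.0

def on_fill(qty: float) -> None:
    # Naive position accounting: every fill event adjusts the position,
    # with no dedup on the venue trade ID.
    global position_qty
    position_qty += qty

on_fill(1.0)  # fill event from the websocket stream
on_fill(1.0)  # duplicate inferred fill from the order status report
assert position_qty == 2.0  # internal position shows 2 ETH vs 1 ETH filled
```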
Let us know if you need more details. We are still investigating and trying to fix this issue, and are happy to provide more info if you have any ideas about what it could be.
Specifications
nautilus_trader version: 1.168