MicroTrendsLtd / NinjaTrader8

NinjaTrader8 Components Strategies and Trading tools
MIT License

Third of Three Issues - Rejected - order Cancel Pending status #6

Closed jmscraig closed 3 years ago

jmscraig commented 3 years ago

Greetings! Happy Monday. I have 20+ hours of test execution completed against the latest code posted here, using my development environment for overnight processing and live market data on a Sim account on a VPS during high-volume daytime hours.

Completed thousands of successful trade executions but surfaced three issues.

1) Deadlocks 2) LIST<> read write index related conflicts, 3) ChangeOrder submission rejects because the order is in “Cancel Pending” status

3) ChangeOrder submission rejects because the order is in “Cancel Pending” status

Resolving this one seems pretty simple. I will see how far I get by checking that the Order status is Active or Working just prior to submission.
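A minimal sketch of that pre-submission check, written without any NinjaTrader types so it stands alone. `SketchOrderState`, `SketchOrder` and `ChangeOrderGuard` are hypothetical names for illustration, and treating Accepted and Working as the "changeable" states is an assumption, not something confirmed by the platform documentation. The check can only see the last state the strategy received, so a cancel already in flight may still slip past it.

```csharp
using System;

// Hypothetical stand-in for the platform's order states; not the NinjaTrader enum.
public enum SketchOrderState { Submitted, Accepted, Working, ChangePending, CancelPending, Cancelled, Filled, Rejected }

public sealed class SketchOrder
{
    public string Id { get; }
    public SketchOrderState State { get; set; }
    public SketchOrder(string id, SketchOrderState state) { Id = id; State = state; }
}

public static class ChangeOrderGuard
{
    // True only when the last state we saw still allows a change request.
    // Assumption: Accepted and Working are the changeable states.
    public static bool CanChange(SketchOrder order)
    {
        if (order == null) return false;
        return order.State == SketchOrderState.Accepted
            || order.State == SketchOrderState.Working;
    }

    // Calls submitChange only when the guard passes; returns whether it was called.
    public static bool TryChange(SketchOrder order, Action<SketchOrder> submitChange)
    {
        if (submitChange == null) throw new ArgumentNullException(nameof(submitChange));
        if (!CanChange(order)) return false;
        submitChange(order);
        return true;
    }
}
```

In a strategy, `submitChange` would wrap the real ChangeOrder call, and a `false` return would mean "skip this change and let the workflow retry later".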

Just thinking out loud... If that does not work and we find that the popup error messages have a connection with NT8 / strategy stability, then at what point do we consider a transition to setting error handling to "IgnoreAllErrors"?

Your thoughts? Code fixes to paste in? Better Ideas? James

MicroTrendsTom commented 3 years ago

yes a simple check should suffice at that point and should be implemented etc.

MicroTrendsTom commented 3 years ago

"Resolving this one seems pretty simple .. I will see how far checking Order status to == Active or Working just prior to submission." I will wait for your feedback so i wont duplicate code/work etc

MicroTrendsTom commented 3 years ago

"at what point do we consider a transition to setting error handling to "IgnoreAllErrors"" i think this can work also, if we are covering rejects and other anomalies we dont need to see them popup etc

jmscraig commented 3 years ago

Update on proposed Cancel Pending fixes and hunting for deadlock root cause

One of the two test machines is still running wonderfully and has completed buy and sell transactions on 4800 ES contracts without a single issue. This machine only has three of the parameter bools turned on: DEBUG - IsTracingMode, FlattenOnTrasition and IsStrategyUnsafeMode.

The second machine completed buy and sell transactions on 4500 NQ contracts and ran until 11:59 PM, when NT8 crashed with a full, unplanned hard shutdown. During the 4500 contracts there were only two sets of popups with rejections due to Order Cancel Pending status.

So a significant improvement in reducing popups with rejections due to Order Cancel Pending status. Yea, a win!

Maybe you know.. I am hoping the NT8 crash had something to do with Notepad's inability to handle the size of the ATS.NT8 log files (see attachments). I just installed Notepad++ to clear this problem.

Attachments: TooBigForNotepad, NotePadFilesOpenAfterClosing10

Checking the log files to see if I can find cause of the crash

jmscraig commented 3 years ago

Most recent ATS.NT8 log entry .. Wow.. Crashed right at Midnight local time. Last log entry was 522 Milliseconds before Midnight.

In this file AlgoSystemBase looks like it might have been in a 'Waiting' state during the crash, so not much to see here...

2020.11.29 23:59:59:951 Sim101 jcNQSlowQuadro ES 12-20 CurrentThread.Name.?|DS= ES 12-20 (6 Tick)|BT=2020.11.29 23:59:58:488|HR=R|CB=11536|LC=3614.75|RX=52|RO=1501|MP=Short|PQ=4|AO=8|WF=Waiting|S=Realtime|: OnOrderUpdate(↑Trg3#348 OrderId=0865346c4c8d40e5a948bc93699ac610 State=Accepted)

2020.11.29 23:59:59:952 Sim101 jcNQSlowQuadro ES 12-20 CurrentThread.Name.?|DS= ES 12-20 (6 Tick)|BT=2020.11.29 23:59:58:488|HR=R|CB=11536|LC=3614.75|RX=52|RO=1501|MP=Short|PQ=4|AO=8|WF=Waiting|S=Realtime|: OnOrderUpdate(↑Trg3#348 OrderId=0865346c4c8d40e5a948bc93699ac610 State=Working)

I notice this file already lists the Instrument, Bar Type and Period. Cool.

jmscraig commented 3 years ago

The NT8 trace file shows the VPS just ran out of memory at midnight. All the logging must have been too much for it. Good news that there is no sign of any Algo system problems.

2020-12-01 23:56:53:086 unhandled exception trapped
2020-12-01 23:56:53:087 Insufficient memory to continue the execution of the program.
2020-12-01 23:56:53:087 System.OutOfMemoryException: Insufficient memory to continue the execution of the program.

MicroTrendsTom commented 3 years ago

wow that's a new one. i wonder if the notepad popup was the cause - i will make that popup thingy a parameter option

MicroTrendsTom commented 3 years ago

in fact i will make a new issue for that and we can confirm this one is ok?

MicroTrendsTom commented 3 years ago

Notepad will popup when an error rejection occurs -and so that last chunk of text at the bottom will be the last error etc

Anyway i will leave this open until you are 100% sure. You can add a Pull Request if you like for the sample strat, unless you are using a different one etc

jmscraig commented 3 years ago

Morning.

I am using an evolved copy of the sample strategy with a collection of notes and the changes I am testing.

I will populate favored changes into a clean version of the sample strategy and submit a pull request.

jmscraig commented 3 years ago

Both ran through to the morning without deadlocking. Both had fatal errors and were disabled this morning. MES with UnSafe mode enabled was disabled at ~6:45am and MNQ at 7:01:55.

There were significant (but not unusual) market moves at these times, and volume was not unusually high either.

Both fatal crashes were unusually similar. Both started

jmscraig commented 3 years ago

Ideas on Fixes

jmscraig commented 3 years ago

The first two attachments are from the client that did NOT have UnSafe enabled

Quadro-Fatal-Disabled-UnSafe-was-Not-Enabled-2020-12-02

Quadro-Fatal-Disabled-UnSafe-was-Not-Enabled-OutPut TAB 2 -- 2020-12-02

jmscraig commented 3 years ago

This third attachment is from the client that had UNSAFE ENABLED.

During this test the error "Unable to verify ErrorFlattenAllPending" was the only visible difference between running with UnSafe enabled and not.

Quadro-Fatal-Disabled-UnSafe-ENABLED-2020-12-02

jmscraig commented 3 years ago

Your thoughts?

jmscraig commented 3 years ago

"wow that's a new one. i wonder if the notepad popup was the cause - i will make that popup thingy a parameter option"

I saw the issue you created and closed for this. Looks good.

jmscraig commented 3 years ago

Another Fix idea.

Wrapping the internals of CancelAllOrders() in a try-catch might have prevented this error from being fatal and having the catch allows capabilities for better logging and possibly a direct response to the error.

image
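A minimal sketch of the try-catch idea, assuming the cancel logic can be passed in as a delegate. `SafeCancel`, `TryCancelAll` and the `log` callback are hypothetical names for illustration; in AlgoSystemBase the try/catch would presumably sit directly inside CancelAllOrders() rather than in a wrapper.

```csharp
using System;

public static class SafeCancel
{
    // Runs cancelAll inside try/catch so an exception is logged instead of escaping.
    // Returns true when cancelAll completed, false when it threw.
    public static bool TryCancelAll(Action cancelAll, Action<string> log)
    {
        if (cancelAll == null) throw new ArgumentNullException(nameof(cancelAll));
        try
        {
            cancelAll();
            return true;
        }
        catch (Exception ex)
        {
            // Record type and message; the caller decides whether to retry or flag an error.
            log?.Invoke("CancelAllOrders failed: " + ex.GetType().Name + ": " + ex.Message);
            return false;
        }
    }
}
```

Returning a bool rather than swallowing the exception silently keeps the decision (retry, move to an error state, or ignore) with the workflow code that made the call.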

jmscraig commented 3 years ago

I am going to populate some of those ideas in my local copy of AlgoSystemBase and test them..

Still like to hear your thoughts and ideas.

jmscraig commented 3 years ago

FYI.. I intend to run overnight tests against the proposed fixes for Rejected - order Cancel Pending status -- integrated into the AlgoSystemBase you committed today.

MicroTrendsTom commented 3 years ago

"Your thoughts?"

i also need to conduct some tests to assess this flatten all etc... Sounds like exit orders were cancelled mid-flight and gave the cannot be submitted error etc. The object ref error needs trapping. The cycle between cancels is decided by the retry mechanism - in theory

i need to test similar and understand the context a bit better etc

jmscraig commented 3 years ago

"i need to test similar and understand the context a bit better etc"

I have a simple timed delay solution to test and submit for this.. was trying to do that last night integrated into yesterday's commits but it kept crashing.. will check on using today's commits

jmscraig commented 3 years ago

Testing today's release on two machines, with the only parameter change being a reduction of the Fast and Slow MA lengths to keep the workflow engine busy..

If that runs well I will migrate in the proposed fixes for CancelPending Rejects and test those.

jmscraig commented 3 years ago

No issues yet. Now testing the proposed CancelPendingFix in one of two running test clients

jmscraig commented 3 years ago

Regarding proposed fixes to reduce the number of order rejections due to order status of Cancel Pending.


Simple is good. With the simple updates in the code below I am trying to address two causes of order rejects due to Cancel Pending Order Status.

Note: Order.Status does not update quickly enough to be a reliable identifier of CancelPending order status.

To restrict unwanted workflow actions I decided to stay in genre with patterns already appearing in this code and propose simple timer-based delays.

  1. flattenOrCancelOrdersInitiatedNoticeEndTime (DateTime var) - a flag to be used at end workflow action points (e.g. TradeManagement()) that prevents ChangeOrder submissions while Flatten, Close Position or Cancel Orders actions are under way.

  2. cancelOrdersSubmitDelayEndTime (DateTime var) - intended to ensure a short delay exists between CancelOrders submissions, allowing the exchange-facing servers enough time to complete the cancels and/or get order status updates populated through the systems, so that our following CancelOrder calls either never take place or, if they do, generate fewer errors.

The intent was to first test a 400 ms delay to give OCOs what usually should be plenty of time to cancel (max of 200 ms per side).

cancelOrdersSubmitDelayEndTime = DateTime.Now.AddMilliseconds(400);
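The two timer gates described above can be sketched as follows. This is an illustration only, not the code from the screenshots: the class `CancelTimingGates`, the members `ChangeOrdersBlocked` and `TryBeginCancel`, and the injectable clock are hypothetical additions; only the two DateTime field names come from the proposal.

```csharp
using System;

public sealed class CancelTimingGates
{
    private readonly Func<DateTime> now;   // injectable clock so the gates can be tested
    private DateTime flattenOrCancelOrdersInitiatedNoticeEndTime = DateTime.MinValue;
    private DateTime cancelOrdersSubmitDelayEndTime = DateTime.MinValue;

    public CancelTimingGates(Func<DateTime> clock = null)
    {
        now = clock ?? (() => DateTime.Now);
    }

    // Gate 1: true while a flatten/close/cancel was started recently;
    // callers such as TradeManagement() would skip ChangeOrder while this is true.
    public bool ChangeOrdersBlocked
    {
        get { return now() < flattenOrCancelOrdersInitiatedNoticeEndTime; }
    }

    // Gate 2: returns false when the previous cancel was submitted too recently;
    // otherwise arms both timers and returns true so the caller may submit the cancel.
    public bool TryBeginCancel(int submitDelayMs, int noticeMs)
    {
        DateTime t = now();
        if (t < cancelOrdersSubmitDelayEndTime) return false;
        cancelOrdersSubmitDelayEndTime = t.AddMilliseconds(submitDelayMs);
        flattenOrCancelOrdersInitiatedNoticeEndTime = t.AddMilliseconds(noticeMs);
        return true;
    }
}
```

One design point worth noting: when `TryBeginCancel` returns false, nothing has been cancelled, so whatever state the workflow engine moves to next has to allow a later retry. The sketch does not attempt to address how the gates interact with the workflow state transitions.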

In testing, the workflow engine goes into a loop, generating far more orders than intended.

Even when set to 1 ms the large unintended SubmitOrder loops occur:

cancelOrdersSubmitDelayEndTime = DateTime.Now.AddMilliseconds(1);

GoLongSubmitOrderWorking:> OnOrderUpdate(↓Trg
GoLongSubmitOrderWorking:> OnOrderUpdate(↓Trg
GoLongSubmitOrderWorking:> OnOrderUpdate(↑Stp

or

GoShortSubmitOrderWorking:> OnOrderUpdate(↑Trg
GoShortSubmitOrderWorking:> OnOrderUpdate(↓Stp
GoShortSubmitOrderWorking:> OnOrderUpdate(↑Trg

I noticed the logical content of the cases looping is pretty thin:

          case StrategyTradeWorkFlowState.GoLongSubmitOrderWorking:
                TradeWorkFlowOnMarketDataDisable();
                break;

[Images: three screenshots of the code diff]

So my three questions are:

  1. Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?
  2. If so, how do we make them work well with the workflow engine?
  3. Better ideas? Existing solutions?

jmscraig commented 3 years ago

The full ATS.NT8 trace file that goes with those images.

ATS.NT8.20201203.Trace-before-Manual-lntervention.txt

jmscraig commented 3 years ago

Note: Unsafe was not enabled

How I had expected this would work:

What I don't see in the log file and what I was expecting to see is:

StrategyTradeWorkFlowState transition from GoShortCancelWorkingOrders to GoShortCancelWorkingOrdersPending prior to the call to CancelAllOrders().

The Trace log does not show this transition to GoShortCancelWorkingOrdersPending where expected.

Then ...

If a call to CancelAllOrders() had taken place in the last millisecond, the delay timer if(..) in CancelAllOrders() would have returned early with no execution.

Upon return of the call to case GoShortCancelWorkingOrders, the following logic would run, driving retries or eventually an error.

                    TradeWorkFlowOnMarketDataEnable();
                    //will continue to loop back here forever unless we have a timeout
                    tradeWorkFlowRetryCount++;
                    if (tradeWorkFlowRetryCount > tradeWorkFlowRetryAlarm)
                        return ProcessWorkFlow(StrategyTradeWorkFlowState.Error);

One might say, 'See all those rows with state labeled as GoShortCancelWorkingOrders?'
'Yabutt' by then State should have transitioned to GoShortCancelWorkingOrdersPending and it has not. Also, the timer only delayed action for 1 ms, yet state remains the same in the log for 4000 ms.

I could force a timer test into each case that calls it, but that is not quality code work so decided to check in with you and get your advice.

===================================================

          case StrategyTradeWorkFlowState.GoShortCancelWorkingOrders:
                if (IsHistoricalTradeOrPlayBack || IsStrategyUnSafeMode)
                {
                    CancelAllOrders();
                    TradeWorkFlow = StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersConfirmed;
                    goto case StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersConfirmed;
                }
                else
                {
                    if (connectionStatusOrder == ConnectionStatus.Connected)
                    {
                        TradeWorkFlow = StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersPending;
                        CancelAllOrders();
                    }
                    TradeWorkFlowOnMarketDataEnable();
                    //will continue to loop back here forever unless we have a timeout
                    tradeWorkFlowRetryCount++;
                    if (tradeWorkFlowRetryCount > tradeWorkFlowRetryAlarm)
                        return ProcessWorkFlow(StrategyTradeWorkFlowState.Error);
                }
                break;

MicroTrendsTom commented 3 years ago

So my three questions are:

  1. Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?
  2. If so how to we make them well with the workflow engine?
  3. Better ideas? Existing solutions?

"Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?"

Yes, but in a calling method or an override of CancelAllOrders etc

Note: This was never necessary in the past trade engines from which this was roughly hewn and migrated to NT8. Something must be at fault to do with locking and event processing or the timing/sequence of calls...

Yes: OrderStates are unreliable - in fact the real way to measure them is to wait for the exchange message to come back - Heisenberg style. Alt: We can look to see where price is relative to an order and decide to cancel or change it, or go ahead and see what message comes back etc
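One way to read "wait for the exchange message back" is to remember which orders have a cancel in flight and clear that mark only when an order update reports a terminal state. The sketch below is an assumption-laden illustration: `PendingCancelTracker` and its methods are hypothetical, and the state names are plain strings chosen for the example rather than the platform's enum.

```csharp
using System;
using System.Collections.Generic;

public sealed class PendingCancelTracker
{
    private readonly object sync = new object();
    private readonly HashSet<string> awaitingCancel = new HashSet<string>();

    // Call right after a cancel request is sent for orderId.
    public void MarkCancelRequested(string orderId)
    {
        lock (sync) awaitingCancel.Add(orderId);
    }

    // Call from the order-update handler; only a terminal state clears the mark.
    // Assumption: "Cancelled", "Filled" and "Rejected" are the terminal state names.
    public void OnOrderUpdate(string orderId, string state)
    {
        if (state == "Cancelled" || state == "Filled" || state == "Rejected")
        {
            lock (sync) awaitingCancel.Remove(orderId);
        }
    }

    // True while a cancel was requested and no terminal update has arrived yet.
    public bool IsCancelInFlight(string orderId)
    {
        lock (sync) return awaitingCancel.Contains(orderId);
    }
}
```

A ChangeOrder path could consult `IsCancelInFlight` before submitting, which relies on the strategy's own record of what it asked for instead of on a status field that may lag.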

Catch up: ok i have now caught up with this thread. Sorry i missed that. So there are a lot of changes to main that might have assisted this... So far no errors this end: ran all night in NQ 10 tick, SMA 5,12 etc, but one deadlock was caught in the debugger and rectified. So a retest, and then look at the above caveats... etc

Modes - horses for courses is one item to think of ...etc
if you want to squeeze more out of the engine then unsafe mode with some lightweight process and some post trade fix/monitor

  1. or simply a price engine to calculate the future reversal of the MA and place a stop entry at that point in time when price gets near etc
  2. or indeed price/order proximity handling for different patterns
  3. and fast market patterns

This testing is unrealistic for the application of the system. It would be impossible to make a profitable system with reversals in that series with a retail system - for each trade the CQG system has to do a lookup across the network for pre trade validation and more besides. Anything needing less than 1 second as critical in retail trading is probably doomed - to process, submit and so reverse... 1 second to submit and get all exits in place is more realistic, then unlock to accept a trade, so within 2 to 3 secs on average all items are good, which covers 99% of all the normal needs.

beyond this a change of approach is vital... very simply a mode such as: stop entry orders in the market - and fast closes - by using the limit orders etc

or exit close with target limits to move past price etc

What is the safe operating mode and benchmark?

Ok so if this is the benchmark at which it goes crash bang wallop - what is the benchmark where it is in safe bounds? This rope snaps at 100KG, what is its safe operation? 10KG to 50KG etc? A realistic test that could stand a chance of being tenable in trading $$$ terms must also be undertaken

WORKFLOW LOOP

Workflow loop is alarming, i've not seen this or replicated it - that might need to be a new bug as this one was about order rejections, or is this part of that?

So is this what is happening? An order rejections loop caused by:

a call to cancel is made and items are locked, new orders are placed, and the cancel goes through and cancels them as they are submitted, as well as the items to be cancelled?

MicroTrendsTom commented 3 years ago

Adding a close with target mode.... for position close

jmscraig commented 3 years ago

I should have posted this comment here. https://github.com/MicroTrendsLtd/NinjaTrader8/issues/18#issuecomment-738866516

"running without 1 hitch in 1 year i do use small fine grain locks such as lock (account.Positions)"

Great!

I have been reading on this since I got up.

After trying to ferret out the object model I believe we should test

lock (Account.Orders) Account.CancelAllOrders(this.Instrument);

and if that does not work then capitulate and test:

lock (Account.All) Account.CancelAllOrders(this.Instrument);


Pulling down last night's commits now

MicroTrendsTom commented 3 years ago

lock (Account.Orders) ok good

MicroTrendsTom commented 3 years ago

A question on locking basics - my brain is down to one brain cell, the other is on strike

    public List<Order> OrdersActive { get { lock (ordersRT) return ordersRT; } }

so if we now do a lock somewhere else

    bool someFoo = false;
    lock (OrdersActive) { someFoo = OrdersActive.Count() > 0; }

// is this not double locking... and is it needed...? is that not the same as the get lock?

    someFoo = OrdersActive.Count() > 0;

MicroTrendsTom commented 3 years ago

Current work scope: i have removed one bottleneck from the workflow, the rejection message - all it means is that you are long already when a long is attempted... or short when a short is attempted... it is optional to flag it as an error...

it might in fact be an overfill and could be closed prior to firing off another entry.. so checking needs to happen - in a very fast market it will simply mean carrying a position potentially unguarded until a new order... so exits could be applied if there are none, or it could be closed, or it could be aggregated into the new long. Therefore the stop mechanism needs to adapt to the actual position, not the proposed one etc

StrategyTradeWorkFlowState.GoLongValidationRejected will now go to Error mode... or back to waiting via property IsOnStrategyTradeWorkFlowStateEntryRejectionError

jmscraig commented 3 years ago

In Multi-threaded run: is it the double locking? .. No

The first lock expires on advent of the second curly bracket in the first block. The second lock is independent and required to prevent the first lock from taking place again while the second lock is in use.

An important question here, which is either a burden or a blessing, is: does the 'return ordersRT' return a copy?

Blessing: If so (it is a copy), that copy is not locked and is independent. Use of it does not require locking if only one thread will use it at a time.

Burden: Copying is slow and loads up Garbage Collector work.
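On the copy question: in C#, `List<T>` is a reference type, so a getter that returns the field hands back the same list object, not a copy. The lock inside the getter is held only while the reference is read. The sketch below shows both that behaviour and an explicit snapshot alternative. `OrderBook`, `Add` and `Snapshot` are hypothetical names, and strings stand in for orders so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public sealed class OrderBook
{
    private readonly List<string> ordersRT = new List<string>();

    // Same shape as the getter in the question: the lock is released as soon as
    // the reference is returned, and the caller receives the same list object.
    public List<string> OrdersActive { get { lock (ordersRT) return ordersRT; } }

    public void Add(string orderId) { lock (ordersRT) ordersRT.Add(orderId); }

    // Alternative: copy under the lock, so the caller can read its private copy
    // without holding any lock, at the cost of an allocation per call.
    public List<string> Snapshot() { lock (ordersRT) return new List<string>(ordersRT); }
}
```

Because the getter returns the live list, a caller that enumerates or counts it while another thread may be adding still needs its own `lock (book.OrdersActive) { ... }` around that work. Taking that outer lock and then calling the getter again inside it is safe on the same thread, since C# `lock` (Monitor) is re-entrant.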

MicroTrendsTom commented 3 years ago

The first lock expires on advent of the second curly bracket in the first block. A thing of beauty

jmscraig commented 3 years ago

I just added a related Enhancement Request/Issue.

I thought it was important to recognize that yes, we want to write good logic that avoids rejections from Cancel Pending, but we also move at speed, so we should embrace the expectation that we will get some rejection messages.

A lot of my personal pursuit to reduce rejection popups is not about the rejection itself but about the WPF reliability loss from receiving and collecting a large stack of popups.

Addressing the popup issue actually helps us better achieve our reliability and execution goals and reduces some of the Cancel Pending pain side of this thread (though less so the deadlocks issue that is the current primary focus).

Here is the link to the issue/enhancement https://github.com/MicroTrendsLtd/NinjaTrader8/issues/19#issue-757280228

jmscraig commented 3 years ago

it might in fact be an overfill and could be closed prior to firing off another entry..

For pragmatic simplicity, reliability and speed / very low execution cost, I have liked the use of three class-level bools stating the strategy's expectation of position at that millisecond: bStrategyPositionLong, bStrategyPositionShort and bStrategyPositionFlat.

Super fast, low cost clarity on whether there is a position divergence or not, so the response can be immediate upon seeing the divergence.

I set these bools 'proactively' right as the strategy decides it should be long, short or flat. This requires your setting of the strategy position perspective to be robust, and after that it is all benefit.
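A minimal sketch of the three-bool idea, assuming the actual position can be reduced to flat/long/short. The three bool names come from the comment above. `ExpectedPosition`, `ActualPosition`, the `Expect*` methods and `Diverges` are hypothetical names added for illustration.

```csharp
using System;

// Hypothetical stand-in for the account/strategy position reported by the platform.
public enum ActualPosition { Flat, Long, Short }

public sealed class ExpectedPosition
{
    public bool bStrategyPositionLong { get; private set; }
    public bool bStrategyPositionShort { get; private set; }
    public bool bStrategyPositionFlat { get; private set; } = true;

    // Set proactively, at the moment the strategy decides what it should be.
    public void ExpectLong()  { Set(true, false, false); }
    public void ExpectShort() { Set(false, true, false); }
    public void ExpectFlat()  { Set(false, false, true); }

    private void Set(bool isLong, bool isShort, bool isFlat)
    {
        bStrategyPositionLong = isLong;
        bStrategyPositionShort = isShort;
        bStrategyPositionFlat = isFlat;
    }

    // True when the reported position differs from what the strategy expects.
    public bool Diverges(ActualPosition actual)
    {
        switch (actual)
        {
            case ActualPosition.Long:  return !bStrategyPositionLong;
            case ActualPosition.Short: return !bStrategyPositionShort;
            default:                   return !bStrategyPositionFlat;
        }
    }
}
```

Routing every change through one `Set` method keeps exactly one of the three flags true, which is the robustness requirement mentioned above. A divergence right after `ExpectLong()` is normal until the entry fills, so a caller would likely only act on a divergence that persists.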


FYI I am currently testing the latest commits on two machines (4 charts) without any changes

Current work Scope: i have removed one bottle neck from the workflow the rejection message - all it means is that you are long already when a long is attempted... or short when a short is attempted... if is optional to flag it as an error...

Great! I needed the ability to open more than one position.

Singapore! I love Singapore!

Not directly related but don't want to open an issue just to reply to these ..

After the markets close.. We have much to catch up on..
I just love, love, love Singapore.
Great people. Great Culture. Great Vibe. Great Food. So much fun!

Amazing Work

In this post and a few others you have described amazing work.
https://github.com/MicroTrendsLtd/NinjaTrader8/issues/4#issuecomment-737043500

I don't impress easily and I am quite impressed!

I have a question or a hundred. Lol..

MicroTrendsTom commented 3 years ago

hungarian notation? bStrategyPositionLong - nice to see that again. i still swear there were better technologies and methods back in the 1990s when responsible people were allowed to use multiple inheritance ha

MicroTrendsTom commented 3 years ago

"Singapore! I love Singapore!"

i'm actually trapped in Subic in the Philippines and had to start again over here - i was based in Penang, Thailand and Singapore - i turned down trading jobs as i don't want to wear a suit and be in an office all day and half the night ha - that might have been a mistake but time will tell - yep Singapore hawker centers, the food is my thing. the beach and sea not so good, and the cost and unforgiving visa work laws... Thailand was easier, and Malaysia

MicroTrendsTom commented 3 years ago

"I don't impress easily and I am quite impressed!"

me also, same same back at you - ty. you might like this - http://cpc2.microtrends.pro/
its work in progress but does import in 1 minute period via SSIS from a db dump

my main concern is whether i can believe the realtime sim, and the delta between it and live. the solution is to use another demo account, as the difference in live is big... not to be ignored.. sometimes it's exchange rules and what not...

jmscraig commented 3 years ago

"Im actually trapped in Subic in the Philippines and had to start again over here - i was based in Penang, Thailand and Singapore - Thailand was easier and Malaysia"

Sounds exciting.. I am from the US: Colorado, San Diego California and now Austin Texas. Love traveling the world. Had a great Filipino flatmate once but have not been there.

"yep Singapore hawker centers the food is my thing."

Oh ya.. and the Chili Crab.. oh my gosh I miss that.

MicroTrendsTom commented 3 years ago

"Oh ya.. and the Chili Crab.. oh my gosh I miss that."

that was it in fact, the hawker center by the beach half way to the east - incredible value too

jmscraig commented 3 years ago

http://cpc2.microtrends.pro/

Wow.. will check that out

MicroTrendsTom commented 3 years ago

yep travelling since 2009 - Asia has some real pearls to see for sure... the Phils is easy visa -terrible internet and food but many good perks

MicroTrendsTom commented 3 years ago

"What I don't see in the log file and what I was expecting to see is: StrategyTradeWorkFlowState transition from GoShortCancelWorkingOrders to GoShortCancelWorkingOrdersPending prior to the call to CancelAllOrders()."

maybe it used a Goto case and skipped....but usually it would set the trade workflow local instance and it gets logged... will need to inspect the case structure

jmscraig commented 3 years ago

Right now stress testing the commits from three hours ago.

jmscraig commented 3 years ago

maybe it used a Goto case and skipped....but usually it would set the trade workflow local instance and it gets logged... will need to inspect the case structure

I looked at the structure (it looked correct) but did not look at the timing / external "who can go next and how soon" logic part of the workflow.

I also believe that when NT8 struggles sometimes silly things happen and so decided to put it on the back burner unless it surfaces again

jmscraig commented 3 years ago

What are your thoughts on adding Limit Entry order capability and also dealing with partial fills?

MicroTrendsTom commented 3 years ago

yep good question, and if you have a Microsoft email account i can show you some examples in TFS of how that's being done elsewhere - but it's a migrated mess from a legacy of a multitude of coders - but it uses this as a base

MicroTrendsTom commented 3 years ago

partial fills - is a good topic for sure - fill or part fill and at market or move it works ok - or leave it...etc many options depends on the context and mode of use

MicroTrendsTom commented 3 years ago

in fact i have a customer who wants to use this but i dont have time to code for him.. so it could be an option for you if you are interested in that type of thing?