Closed jmscraig closed 3 years ago
yes a simple check should suffice at that point and should be implemented etc.
"Resolving this one seems pretty simple .. I will see how far checking Order status to == Active or Working just prior to submission." I will wait for your feedback so i wont duplicate code/work etc
"at what point do we consider a transition to setting error handling to "IgnoreAllErrors"" i think this can work also, if we are covering rejects and other anomalies we dont need to see them popup etc
Update on proposed Cancel Pending fixes and hunting for deadlock root cause
One of the two test machines is still running wonderfully and has completed buy and sell transactions on 4800 ES contracts without a single issue. This machine only has three of the param bools turned on: DEBUG - IsTracingMode, FlattenOnTrasition and IsStrategyUnsafeMode.
The second machine completed buy and sell transactions on 4500 NQ contracts and ran until 11:59 PM, when NT8 crashed and did a full, hard, unplanned shutdown. During the 4500 contracts there were only two sets of popups with rejections due to Order Cancel Pending Status.
So significant improvement reducing popups with rejections due to Order Cancel Pending Status. Yea a win!
Maybe you know.. I am hoping the NT8 crash had something to do with Notepad's inability to handle the size of the ATS.NT8 log files (see attachments). I just installed Notepad++ to clear this problem.
Checking the log files to see if I can find cause of the crash
Most recent ATS.NT8 log entry .. Wow.. Crashed right at Midnight local time. Last log entry was 522 Milliseconds before Midnight.
In this file AlgoSystemBase looks like it might have been in the 'Waiting' state during the crash, so not much to see here...
2020.11.29 23:59:59:951 Sim101 jcNQSlowQuadro ES 12-20 CurrentThread.Name.?|DS= ES 12-20 (6 Tick)|BT=2020.11.29 23:59:58:488|HR=R|CB=11536|LC=3614.75|RX=52|RO=1501|MP=Short|PQ=4|AO=8|WF=Waiting|S=Realtime|: OnOrderUpdate(↑Trg3#348 OrderId=0865346c4c8d40e5a948bc93699ac610 State=Accepted)
2020.11.29 23:59:59:952 Sim101 jcNQSlowQuadro ES 12-20 CurrentThread.Name.?|DS= ES 12-20 (6 Tick)|BT=2020.11.29 23:59:58:488|HR=R|CB=11536|LC=3614.75|RX=52|RO=1501|MP=Short|PQ=4|AO=8|WF=Waiting|S=Realtime|: OnOrderUpdate(↑Trg3#348 OrderId=0865346c4c8d40e5a948bc93699ac610 State=Working)
I notice this file has the Instrument and Bar Type and Period already listed. Cool.
The NT8 trace file shows the VPS just ran out of memory at midnight. All the logging must have been too much for it. Good news that there is no sign of any Algo system problems.
2020-12-01 23:56:53:086 unhandled exception trapped
2020-12-01 23:56:53:087 Insufficient memory to continue the execution of the program.
2020-12-01 23:56:53:087 System.OutOfMemoryException: Insufficient memory to continue the execution of the program.
wow thats a new one i wonder if the notepad popup was the cause - i will make that popup thingy a parameter option
in fact i will make a new issue for that and we can confirm this one is ok?
Notepad will popup when an error rejection occurs -and so that last chunk of text at the bottom will be the last error etc
Anyway i will leave this open until you are 100% sure. you can add a Pull Request if you like for the sample strat unless you are using a different etc
Morning.
I am using an evolved copy of the sample strategy with a collection of notes and the changes I am testing.
I will populate favored changes into a clean version of the sample strategy and submit a pull request.
Both ran through to the morning without deadlocking. Both had fatal errors and were disabled this morning. MES with UnSafe mode enabled was disabled at ~6:45am and MNQ at 7:01:55.
There were significant (not unusual) market moves at these times, and volume was not unusually high either.
Both fatal crashes were unusually similar. Both started
Ideas on Fixes
The first two attachments are from the client that did NOT have UnSafe enabled
This third attachment is from the client that had UNSAFE ENABLED.
During this test the error Unable to verify ErrorFlattenAllPending was the only visible difference between running UnSafe enabled and not.
Your thoughts?
"wow thats a new one i wonder if the notepad popup was the cuase - i will make that popup thiny a parameter option"
I saw the issue you created and closed for this. Looks good.
Another Fix idea.
Wrapping the internals of CancelAllOrders() in a try-catch might have prevented this error from being fatal and having the catch allows capabilities for better logging and possibly a direct response to the error.
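A minimal sketch of that try-catch idea, written as a standalone helper so it can be read outside the strategy. SafeCancel, TryCancelAll and the log delegate are hypothetical names for illustration only; they are not part of AlgoSystemBase, and the real CancelAllOrders() body would sit where the delegate is invoked.

```csharp
using System;

// Hypothetical helper, not part of AlgoSystemBase.
public static class SafeCancel
{
    // Runs the supplied cancel action and reports failure to the caller
    // instead of letting the exception propagate and disable the strategy.
    public static bool TryCancelAll(Action cancelAllOrders, Action<string> log)
    {
        try
        {
            cancelAllOrders();
            return true;
        }
        catch (Exception ex)
        {
            // Record what went wrong; the caller decides whether to retry
            // or move the workflow to an error state.
            log("CancelAllOrders failed: " + ex.GetType().Name + ": " + ex.Message);
            return false;
        }
    }
}
```

Returning a bool rather than swallowing the exception silently keeps the decision (retry, flatten, or go to Error) with the workflow case that made the call.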
I am going to populate some of those ideas in my local copy of AlgoSystemBase and test them..
Still like to hear your thoughts and ideas.
FYI.. I intend to run overnight tests against the proposed fixes for Rejected - order Cancel Pending status -- integrated into the AlgoSystemBase you committed today.
Your thoughts? i also need to conduct some tests to assess this flatten all etc... Sounds like exit orders were cancelled mid-flight and gave the cannot be submitted error etc. The object ref error needs trapping. The cycle between cancels is decided by the retry mechanism - in theory
i need to test similar and understand the context a bit better etc
"i need to test similar and understand the context a bit better etc"
I have a simple timed delay solution to test and submit for this.. was trying to do that last night integrated into yesterday's commits but it kept crashing.. will check on using today's commits
Testing today's release on two machines with the only parameter change being to reduce Fast and Slow MA length to keep the workflow engine busy..
If that runs well will migrate in proposed fixes for CancelPending Rejects and test those.
No issues yet. Now testing the proposed CancelPendingFix in one of two running test clients
Regarding proposed fixes to reduce the number of order rejections due to order status of Cancel Pending.
Simple is good. With the simple updates in the code below I am trying to address two causes of order rejects due to Cancel Pending Order Status.
Note: Order.Status does not update quickly enough to be a reliable identifier of CancelPending order status.
In order to restrict unwanted workflow actions I decided to stay in keeping with patterns already appearing in this code and propose simple timer-based delays.
flattenOrCancelOrdersInitiatedNoticeEndTime (DateTime var) - a flag to be checked at end workflow action points (e.g. TradeManagement()) that prevents ChangeOrder submissions while Flatten, Close Position or Cancel Orders actions are under way.
cancelOrdersSubmitDelayEndTime (DateTime var) - intended to ensure a short delay exists between CancelOrders submissions, giving the exchange-facing servers enough time to complete the cancels and/or get order status updates populated through the systems, so that our following CancelOrder calls either never take place or, if they do, generate fewer errors.
The intent was to first test 400ms delay to give OCOs what usually should be plenty of time to cancel, (MAX of 200ms per side).
cancelOrdersSubmitDelayEndTime = DateTime.Now.AddMilliseconds(400);
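To make the delay idea concrete, here is a rough sketch of the gate on its own, outside the workflow engine. CancelSubmitGate and TryEnter are hypothetical names; only the cancelOrdersSubmitDelayEndTime field mirrors the variable described above. The current time is passed in as a parameter (an assumption made so the behaviour can be checked without waiting on the clock).

```csharp
using System;

// Hypothetical gate; the field name mirrors cancelOrdersSubmitDelayEndTime.
public sealed class CancelSubmitGate
{
    private readonly TimeSpan delay;
    private DateTime cancelOrdersSubmitDelayEndTime = DateTime.MinValue;

    public CancelSubmitGate(TimeSpan delay)
    {
        this.delay = delay;
    }

    // Returns true if a cancel may be submitted now and starts a new
    // delay window; returns false while the previous window is still open.
    public bool TryEnter(DateTime now)
    {
        if (now < cancelOrdersSubmitDelayEndTime)
            return false;

        cancelOrdersSubmitDelayEndTime = now + delay;
        return true;
    }
}
```

A caller would use it as `if (gate.TryEnter(DateTime.Now)) CancelAllOrders();`. Returning a bool lets the calling workflow case tell a skipped cancel from an executed one, which a silent early return inside CancelAllOrders() would not.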
The workflow engine goes into a loop generating far more orders than intended.
Even when set to 1ms the large unintended SubmitOrder loops occur
cancelOrdersSubmitDelayEndTime = DateTime.Now.AddMilliseconds(1);
GoLongSubmitOrderWorking:> OnOrderUpdate(↓Trg
GoLongSubmitOrderWorking:> OnOrderUpdate(↓Trg
GoLongSubmitOrderWorking:> OnOrderUpdate(↑Stp
or
GoShortSubmitOrderWorking:> OnOrderUpdate(↑Trg
GoShortSubmitOrderWorking:> OnOrderUpdate(↓Stp
GoShortSubmitOrderWorking:> OnOrderUpdate(↑Trg
I noticed the logical content of the cases looping is pretty thin:
"case StrategyTradeWorkFlowState.GoLongSubmitOrderWorking: TradeWorkFlowOnMarketDataDisable(); break;"
code diff
So my three questions are:
1. Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?
The full ATS.NT* tracefile that goes with those images.
Note: Unsafe was not enabled
How I had expected this would work:
What I don't see in the log file and what I was expecting to see is:
StrategyTradeWorkFlowState transition from GoShortCancelWorkingOrders to GoShortCancelWorkingOrdersPending prior to the call to CancelAllOrders().
The Trace log does not show this transition to GoShortCancelWorkingOrdersPending where expected.
Then ...
If a call to CancelAllOrders() had taken place in the last millisecond, the delay timer if(..) in CancelAllOrders() would have returned the call early with no execution.
Upon return of the call to case GoShortCancelWorkingOrders the following logic would be enabled driving retries or eventually an error.
TradeWorkFlowOnMarketDataEnable();
//will continue to loop back here forever unless we have a timeout
tradeWorkFlowRetryCount++;
if (tradeWorkFlowRetryCount > tradeWorkFlowRetryAlarm)
    return ProcessWorkFlow(StrategyTradeWorkFlowState.Error);
One might say, 'See all those rows with state labeled as GoShortCancelWorkingOrders?'
'Yabutt' by then State should have transitioned to GoShortCancelWorkingOrdersPending and it has not.
Also, the timer only delayed action for 1ms, yet state remains the same in the log for 4000ms.
I could force a timer test into each case that calls it, but that is not quality code work, so I decided to check in with you and get your advice.
===================================================
case StrategyTradeWorkFlowState.GoShortCancelWorkingOrders:
    if (IsHistoricalTradeOrPlayBack || IsStrategyUnSafeMode)
    {
        CancelAllOrders();
        TradeWorkFlow = StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersConfirmed;
        goto case StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersConfirmed;
    }
    else
    {
        if (connectionStatusOrder == ConnectionStatus.Connected)
        {
            // state moves to Pending prior to the call to CancelAllOrders()
            TradeWorkFlow = StrategyTradeWorkFlowState.GoShortCancelWorkingOrdersPending;
            CancelAllOrders();
        }
        TradeWorkFlowOnMarketDataEnable();
        //will continue to loop back here forever unless we have a timeout
        tradeWorkFlowRetryCount++;
        if (tradeWorkFlowRetryCount > tradeWorkFlowRetryAlarm)
            return ProcessWorkFlow(StrategyTradeWorkFlowState.Error);
    }
    break;
So my three questions are:
- Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?
- If so, how do we make them work well with the workflow engine?
- Better ideas? Existing solutions?
"Do you like and want to use the flattenOrCancelOrdersInitiatedNoticeEndTime and cancelOrdersSubmitDelayEndTime concepts?"
Yes but in a calling method or an override of the cancelAllOrders etc
Note: This was never necessary in the past trade engines from which this was roughly hewn and migrated to NT8. Something must be at fault to do with locking and event processing or the timing/sequence of calls/....
Yes: OrderStates are unreliable - in fact to measure them the real way to do it is wait for the exchange message back - Heisenberg style. Alt: We can look to see where price is relative to an order and decide to cancel or change it or go ahead and see what message comes back etc
Catch up
ok i have now caught up with this thread. Sorry i missed that. so there are a lot of changes to main that might have assisted this... So far no errors this end: ran all night in NQ 10 tick sma 5,12 etc but one deadlock was caught in the debugger and rectified. So a retest and then look at the above caveats...etc
Modes - horses for courses is one item to think of ...etc
if you want to squeeze more out the engine then unsafe mode with some lightweight process and some post trade fix/monitor
This testing is unrealistic to the application of the system. It would be impossible to make a profitable system with reversals in that series with a retail system: for each trade the CQG system has to do a lookup across the network for pre-trade validation and more besides. Anything needing less than 1 second as critical in retail trading is probably doomed - to process, submit and so reverse... 1 second to submit and get all exits in place is more realistic, then unlock to accept a trade, and so within 2 to 3 secs on average all items are good, which is 99% of all the normal needs.
beyond this a change of approach is vital... very simply a mode such as: stop entry orders in the market - and fast closes - by using the limit orders etc
or exit close with target limits to move past price etc
What is the safe operating mode and Benchmark?
Ok so if this is the benchmark at which it goes crash bang wallop - what is the benchmark where it is in safe bounds? This rope snaps at 100KG, what is its safe operation? 10KG to 50KG etc? a realistic test that could stand a chance of being tenable in trading $$$ terms must also be undertaken
WORKFLOW LOOP
Workflow loop is alarming, i've not seen this or replicated it - that might need to be a new bug as this one was about order rejections, or is this part of that?
So is this what is happening?
An order rejections loop caused by a call to cancel while items are locked: new orders are placed, and the cancel goes through and cancels them as they are submitted, as well as the items to be cancelled?
Adding a close with target mode.... for position close
I should have posted this comment here. https://github.com/MicroTrendsLtd/NinjaTrader8/issues/18#issuecomment-738866516
running with out 1 hitch in 1 year i do use small fine grain locks such as lock (account.Positions)"
Great!
I have been reading on this since I got up.
After trying to ferret out the object model I believe we should test
lock (Account.Orders) Account.CancelAllOrders(this.Instrument);
and if that does not work then capitulate and test.
lock (Account.All) Account.CancelAllOrders(this.Instrument);
Pull down last nights commits now
lock (Account.Orders) ok good
A Question on locking basics: my brain is down to one brain cell, the other is on strike
public List
so if we now do a lock somewhere else
bool someFoo = false;
lock (OrdersActive) { someFoo = OrdersActive.Count() > 0; }
// is this not double locking... and is it needed...? is that not the same as the get lock?
someFoo = OrdersActive.Count() > 0;
Current work Scope:
i have removed one bottleneck from the workflow, the rejection message - all it means is that you are long already when a long is attempted... or short when a short is attempted... it is optional to flag it as an error...
it might in fact be an overfill and could be closed prior to firing off another entry.. so checking needs to happen - in a very fast market it will simply mean carrying a position potentially unguarded until the new order... so exits could be applied if there are none, or it could be closed, or it could be aggregated into the new long; therefore the stop mechanism needs to adapt to the actual position, not the proposed one etc
StrategyTradeWorkFlowState.GoLongValidationRejected will now go to Error mode... or back to waiting via property IsOnStrategyTradeWorkFlowStateEntryRejectionError
In Multi-threaded run: is it the double locking? .. No
The first lock expires on advent of the second curly bracket in the first block. The second lock is independent and required to prevent the first lock from taking place again while the second lock is in use.
An important question here that is a burden or a blessing is .. does the 'return ordersRT' return a copy?
Blessing: If so (is a copy) that copy is not locked and independent. Use of it does not require locking if only one thread will use it at a time.
Burden: Copying is slow and loads up Garbage Collector work.
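A small sketch of the "returns a copy" case discussed above. OrderBook, sync and the string order ids are hypothetical and chosen only for illustration; ordersRT and OrdersActive echo the names used in this thread, and I am assuming (not confirming) that the real getter looks anything like this.

```csharp
using System.Collections.Generic;

// Hypothetical container; ordersRT and OrdersActive echo names in the thread.
public sealed class OrderBook
{
    private readonly object sync = new object();
    private readonly List<string> ordersRT = new List<string>();

    // The lock is held only while the copy is made and is released when
    // the lock block's closing brace is reached. The caller receives an
    // independent snapshot that needs no further locking if one thread uses it.
    public List<string> OrdersActive
    {
        get { lock (sync) { return new List<string>(ordersRT); } }
    }

    public void Add(string orderId)
    {
        lock (sync) { ordersRT.Add(orderId); }
    }
}
```

If the getter returned ordersRT itself instead of a copy, callers would have to take the same lock while reading it. C# lock (Monitor) is re-entrant on the same thread, so an outer lock on the same object followed by the getter's inner lock would not block itself.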
The first lock expires on advent of the second curly bracket in the first block. A thing of beauty
I just added a related Enhancement Request/Issue.
I thought it was important to recognize that yes, we want to write good logic that avoids rejections from Cancel Pending, but we also actually move at speed, so we should embrace the expectation that we will get some rejection messages.
A lot of my personal pursuit to reduce rejection popups is not about the rejection itself but the WPF reliability loss from receiving and collecting a large stack of popups.
Addressing the Popup issue actually helps us better achieve our reliability and execution goals and reduces some of the Cancel Pending pain side of this thread (though less so the deadlocks issue that is the current primary focus).
Here is the link to the issue/enhancement https://github.com/MicroTrendsLtd/NinjaTrader8/issues/19#issue-757280228
it might in fact be an overfill and could be closed prior to firing off another entry..
for pragmatic simplicity and reliability and speed/very low execution cost I have liked use of three class level bools stating the strategy's expectation of position at that millisecond: bStrategyPositionLong, bStrategyPositionShort and bStrategyPositionFlat
Super fast, low cost clarity on when there is a Position divergence or not so response can be immediate upon seeing the divergence.
I set these bools 'proactively' right as the strategy decides it should be long, short or flat. This requires your setting of strategy position perspective to be robust, and then after that it is all benefit.
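Roughly what that divergence check could look like. This sketch swaps the three bools for a single enum (my substitution, so that exactly one expectation can hold at a time); ExpectedPosition, PositionExpectation and the signedQuantity parameter are hypothetical names, not anything in the strategy code.

```csharp
// Hypothetical types; an enum stands in for the three bStrategyPosition* bools.
public enum ExpectedPosition { Flat, Long, Short }

public sealed class PositionExpectation
{
    private ExpectedPosition expected = ExpectedPosition.Flat;

    public ExpectedPosition Expected
    {
        get { return expected; }
    }

    // Called proactively, right as the strategy decides where it should be.
    public void Expect(ExpectedPosition position)
    {
        expected = position;
    }

    // signedQuantity: positive = long, negative = short, zero = flat.
    public bool IsDivergent(int signedQuantity)
    {
        ExpectedPosition actual =
            signedQuantity > 0 ? ExpectedPosition.Long :
            signedQuantity < 0 ? ExpectedPosition.Short :
            ExpectedPosition.Flat;
        return actual != expected;
    }
}
```

The check is a single comparison, so it is cheap enough to run on every position or order event.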
FYI I am currently testing on two machines (4 charts) the latest commits without any changes
Current work Scope: i have removed one bottleneck from the workflow, the rejection message - all it means is that you are long already when a long is attempted... or short when a short is attempted... it is optional to flag it as an error...
Great! I needed the ability to open more than one position.
Not directly related but don't want to open an issue just to reply to these ..
After the markets close.. We have much to catch up on..
I just love, love, love Singapore.
Great people. Great Culture. Great Vibe. Great Food. So much fun!
In this post and a few others you have described amazing work.
https://github.com/MicroTrendsLtd/NinjaTrader8/issues/4#issuecomment-737043500
I don't impress easily and I am quite impressed!
I have a question or a hundred. Lol..
hungarian notation? bStrategyPositionLong nice to see that again - i still swear there were better technologies and methods back in the 1990s when responsible people were allowed to use multiple inheritance ha
Singapore! I love Singapore! - im actually trapped in Subic in the Philippines and had to start again over here - i was based in Penang, Thailand and Singapore - i turned down trading jobs as i dont want to wear a suit and be in an office all day and half the night ha - that might have been a mistake but time will tell - yep Singapore hawker centers the food is my thing. the beach sea not so good and the cost and unforgiving visa work laws.. .Thailand was easier and Malaysia
I don't impress easily and I am quite impressed! me also same same back at you - ty you might like this - http://cpc2.microtrends.pro/
its work in progress but does import in 1 minute period via SSIS from a db dump
my main concern is if i can believe the realtime sim and delta between live the solution is to use another demo account as the difference in live is big... not to be ignored.. sometimes its exchange rules and what not...
Im actually trapped in Subic in the Philippines and had to start again over here - i was based in Penang, Thailand and Singapore - Thailand was easier and Malaysia
Sounds exciting .. I am from the US: Colorado, San Diego California and now Austin Texas. Love traveling the world. Had a great Filipino flatmate once but have not been there.
yep Singapore hawker centers the food is my thing. Oh ya.. and the Chili Crab.. oh my gosh I miss that.
Oh ya.. and the Chili Crab.. oh my gosh I miss that. that was it in fact the hawker center by the beach half way to the east - incredible value too
Wow.. will check that out
yep travelling since 2009 - Asia has some real pearls to see for sure... the Phils is easy visa -terrible internet and food but many good perks
"What I don't see in the log file and what I was expecting to see is: StrategyTradeWorkFlowState transition from GoShortCancelWorkingOrders to GoShortCancelWorkingOrdersPending prior to the call to CancelAllOrders()."
maybe it used a Goto case and skipped....but usually it would set the trade workflow local instance and it gets logged... will need to inspect the case structure
Right now stress testing the commits from three hours ago.
maybe it used a Goto case and skipped....but usually it would set the trade workflow local instance and it gets logged... will need to inspect the case structure
I looked at the structure (looked correct) but did not look at the timing / external "who can go next and how soon" logic part of the workflow.
I also believe that when NT8 struggles sometimes silly things happen and so decided to put it on the back burner unless it surfaces again
What are your thoughts on adding Limit Entry order capability and also dealing with partial fills?
yep good question and if you got a Microsoft email account i show you some example in TFS of how thats being done elsewhere -but its a migrated mess from some legacy of a multitude of coders - but it uses this as a base
partial fills - is a good topic for sure - fill or part fill and at market or move it works ok - or leave it...etc many options depends on the context and mode of use
in fact i have a customer who wants to use this but i dont have time to code for him.. so it could be an option for you if you are interested in that type of thing?
Greetings! Happy Monday. I have 20+ hours of test execution completed against the latest code posted here, using my development environment for overnight processing with live market data on a Sim account, and a VPS during high volume daytime hours.
Completed thousands of successful trade executions but surfaced three issues.
1) Deadlocks
2) LIST<> read write index related conflicts
3) ChangeOrder submission rejects because the order is in “Cancel Pending” status
3) ChangeOrder submission rejects because the order is in “Cancel Pending” status
Resolving this one seems pretty simple .. I will see how far checking Order status to == Active or Working just prior to submission.
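A minimal sketch of that pre-submission check. SketchOrderState and OrderChangeGuard are hypothetical stand-ins, not the NinjaTrader types, and treating "Active" as the Accepted state seen in the log lines is my assumption.

```csharp
// Hypothetical stand-in for the platform's order state values.
public enum SketchOrderState
{
    Submitted, Accepted, Working, CancelPending, Cancelled, Filled, Rejected
}

public static class OrderChangeGuard
{
    // True only for states in which a change request is worth sending.
    // Assumption: "Active" in the comment above maps to Accepted.
    public static bool CanChange(SketchOrderState state)
    {
        return state == SketchOrderState.Accepted
            || state == SketchOrderState.Working;
    }
}
```

A caller would skip the change when CanChange returns false. This only narrows the window: a cancel can already be in flight while the last reported state still reads Working, so some rejections should still be expected.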
Just thinking out loud... If that does not work and we find that the Popup error messages have a connection with NT8 / strategy stability then at what point do we consider a transition to setting error handling to "IgnoreAllErrors"
Your thoughts? Code fixes to paste in? Better Ideas? James