mdeverdelhan / ta4j-origins

A Java library for technical analysis ***Not maintained anymore, kept for archival purposes, see #192***
https://github.com/mdeverdelhan/ta4j/issues/192
MIT License
375 stars 224 forks source link

Tick.inPeriod() bug? #42

Closed fshsweden closed 9 years ago

fshsweden commented 9 years ago

I have a lot of recorded ticks (price + volume), and sometimes a few have exactly the same timestamp.

For this example, I want all my ticks to be collected within a TA4J Tick instance with period 1 second.

I use this code to round off my timestamp to nearest second: new_ts = ((ts + 500) / 1000) * 1000;

I keep track of lastTick, to see if this new tick is on the same second as last, and if so, I call addTrade instead.

if (lastTick != null && lastTick.inPeriod(d)) {
        lastTick.addTrade(tradeAmount, tradePrice);
        return lastTick;
    }
    else {
        Tick newTick = new Tick(Period.seconds(1), d);
        lastTick = newTick;
            newTick.addTrade(tradeAmount, tradePrice);      
            series.addTick(newTick);
    return newTick;
    }

Now the problem: inPeriod() returns false if endTime is equal or less to the given timestamp. Since all ticks within a second will end up having the same ts, inPeriod() must return true.

My fix was:

public boolean inPeriod(DateTime timestamp) {
    return timestamp != null
            && !timestamp.isBefore(beginTime)
            /* bugfix that works for me: endTime is included in period! */
            &&  (timestamp.isEqual(endTime) || timestamp.isBefore(endTime));
}

Is this the wrong way to "fix" the problem that will cause unwanted sideeffects later? Is it better if I "clean" my data and concatenate all ticks to a single tick or....what? Please guide me.

mdeverdelhan commented 9 years ago

Hi and thank you for your interest! :)

I don't understand well what you are trying to do. You seems to correctly feed your series with the trades you get, but one second seems pretty short for your ticks.

A few explanations (just in case):

Each "tick" represents the opening, closing, high and low prices of a stock for a time period. Thus you can run your strategy on each tick (instead of each trade the market executes, which may involves problems in illiquid markets due to the lack of trades). Normally, all the ticks in your series should have the same duration.

Therefore:

Smaller ticks are theoritically possible but I don't know if it would be relevant (because it would become too sensitive to extrema). Also, instead you may consider executing your strategy on each trade provided by your exchange.

Anyway, for now with ta4j:

About your work:

I don't understand why you want to round your timestamp to the nearest second. Why do you want to loose information?

Your fix may work but it allows something I want to avoid: you might "aggregate" a trade in two (or more) different ticks, which is a perfect non-sense for me (but I am open to anything).

lequer commented 9 years ago

If I may jump in ... I am a forex trader and tend to scalp. The lowest time frame I go down to is 1 minute, which my broker provides, which is great already, and this is really good enough my needs. Maybe you want to build your timeseries on the lowest time timeframe that your broker provides and Then react to the latest tick to react to the market. So you build the timeseries on the lowest timeframe provided , stream the latest ticks and react on them. Once the timeframe ends, you rebuild the timeseries and restart. At this stage ta4j does the job for me.

fshsweden commented 9 years ago

OK I feel that we are missing the point here. Let's ignore the round-to-nearest-second and the fact that we DO have sub-second ticks and also the fact that we get many, many ticks per day. We're talking (in a Carl Sagan voice) "billionzz-and-billionzz" of rows in the database.

Whether the time period is ms, sek, minutes, days, years or centuries shouldn't matter btw. Every time period has an open&close and high&low, as well as total volume. This is what I understood your Tick class was handling, by having an addTrade() method for all trades that occur within the Tick period.

Anyhow, we are using TA för intraday trading, and I was testing your framework on our data, by doing a quick hack of the TradingBotOnMovingTimeSeries.java example. It immediately gave exceptions on "Cannot add a tick with end time <= to series end time".

I read trades from the database for a whole day (symbol, timestamp, price and qty) and feed it into the program. In this particular testcase I had about 34000 trades. A few of them has the same timestamp. Whether the TimeSeries period is 1 second, 10 minutes or 1 hour shouldn't matter, right? For the sake of argument, I created Ticks with period 1 minute and 1 hour too, but got exactly the same exceptions.

Since it occurred every time we had two trades with the same timestamp, so I started to see if I could find a way around the problem.

The Tick class computes beginTime as the given time (stored in endTime) minus the period. Doesn't this complicate matters when you add trades sequentially? Wouldn't it be a lot easier if you took the given time as beginTime and set endTime to beginTime + period? Since you normally have trades in time-order with increasing timestamp? How else can I test that "this timestamp" belongs to the "current Tick" or if I should create a new Tick? I must have misunderstood something here.

Thats why I had the "lastTick.inPeriod(d)" call, to see if my tick was to be added to the current Tick object, else I would create a new Tick for the next timeperiod. Also, the rounding was just trying to avoid the problem with the endTime/beginTime I mentioned earlier.

So, to clearify, we are not talking HFT and we are talking intraday TA (of course in coop with other proprietary analysis). I just want to be able to feed in my data with duplicate timestamps.

mdeverdelhan commented 9 years ago

Ok. Sorry for the misunderstanding; I can't know a priori what's your level of skills when you first come. Also your explanations about timestamp rounding and one-second time periods made me think you were starting the wrong way. It's clearer now.

So:

The Tick class computes beginTime as the given time (stored in endTime) minus the period. Doesn't this complicate matters when you add trades sequentially? Wouldn't it be a lot easier if you took the given time as beginTime and set endTime to beginTime + period?

It's a matter of point of view: Yes, maybe you are right when you add trades sequentially. But a common use case is to feed a series with tick data directly, in which case the given time is always the end time.

Is this the wrong way to "fix" the problem that will cause unwanted sideeffects later?

Your fix makes the time bounds of the tick to be both inclusive. It comes with the problem that you may have a timestamp for which inPeriod() would return true for 2 consecutive ticks (and I wanted to avoid that). It opens the door to unwanted sideeffects and makes me think that something goes wrong (even if I can't hit the nail on the head).

For now I don't see the relationship between the inPeriod() return value and the fact that some of your trades have the same timestamp. In my mind if inPeriod() return false you have to create a new tick. All subsequent records with the same timestamp should be added to this new tick (inPeriod() would return true for them).

Check out this example: some trades have the same timestamp. See if it helps. If not, would you mind posting a SSCCE of a 3 trades case? Thus I could try to debug it and see if there is a bug.