luisdamiano / gsoc17-hhmm

Bayesian Hierarchical Hidden Markov Models applied to financial time series, a research replication project for Google Summer of Code 2017.
Creative Commons Attribution Share Alike 4.0 International

main-sim.R #1

Open isaactarume opened 5 years ago

isaactarume commented 5 years ago

gsoc17-hhmm/tayal2009/main-sim.R

Hi there. I am trying to re-run the simulations you did, and I keep getting the error below. Any ideas?

# Model estimation --------------------------------------------------------

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
stan.model = 'tayal2009/stan/hhmm-tayal2009.stan'
stan.data = list(
  T = T.length,
  K = K,
  L = L,
  G = G,
  g = g[dataset$z],
  x = ifelse(dataset$x < 10, dataset$x, dataset$x - 9)
)
Error: object 'G' not found
stan.fit <- stan(file = stan.model,
  model_name = stan.model,
  data = stan.data, verbose = T,
  iter = n.iter, warmup = n.warmup,
  thin = n.thin, chains = n.chains,
  cores = n.cores, seed = n.seed)#w, init = init_fun)

TRANSLATING MODEL 'tayal2009/stan/hhmm-tayal2009.stan' FROM Stan CODE TO C++ CODE NOW.
successful in parsing the Stan model 'tayal2009/stan/hhmm-tayal2009.stan'.

CHECKING DATA AND PREPROCESSING FOR MODEL 'tayal2009/stan/hhmm-tayal2009.stan' NOW.

COMPILING MODEL 'tayal2009/stan/hhmm-tayal2009.stan' NOW.

STARTING SAMPLER FOR MODEL 'tayal2009/stan/hhmm-tayal2009.stan' NOW.
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) :
  Exception: variable does not exist; processing stage=data initialization; variable name=sign; base type=int (in 'model6118615d1dc6_tayal2009_stan_hhmm_tayal2009_stan' at line 12)

In addition: Warning message:
In FUN(X[[i]], ...) : data with name sign is not numeric and not used
failed to create the sampler; sampling not done

michaelweylandt commented 5 years ago

Hmmm.... There are two errors here.

The first looks like a typo in main-sim.R: the Stan program being called doesn't take a G variable, so that should be safe to remove.

The second looks like the sign variable is not being created in main-sim.R. From related code in main.R, it looks like it could be created as ifelse(dataset$x < L + 1, 1, 2).

@luisdamiano Does that look right to you?
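
Applied to the posted snippet, the two fixes might look like the sketch below. The simulated `dataset`, the value `L = 9` (inferred from the constants 10 and 9 in the posted code) and `K = 4` are stand-ins for illustration, not values taken from the repository, and the remaining entries of the list (such as `g`) are left out here.

```r
# Stand-in for the object that main-sim.R simulates. Assumption: observations
# are coded 1..2L, the first L codes for one sign and the rest for the other.
set.seed(1)
L <- 9   # assumed from the constants 10 and 9 in the posted code
K <- 4   # hypothetical number of hidden states
dataset <- list(x = sample(seq_len(2 * L), size = 50, replace = TRUE))
T.length <- length(dataset$x)

stan.data <- list(
  T    = T.length,
  K    = K,
  L    = L,
  # G is dropped: the Stan program reportedly does not take it
  sign = ifelse(dataset$x < L + 1, 1, 2),   # the variable the sampler asked for
  x    = ifelse(dataset$x < L + 1, dataset$x, dataset$x - L)
)

str(stan.data)
```

Note that `sign` is computed from the raw `dataset$x`, before the codes are folded back into 1..L.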

isaactarume commented 5 years ago

Thanks mate, it worked perfectly after that!

That said, I am trying to adapt this code to predict anomalies using an HHMM. Any ideas?

michaelweylandt commented 5 years ago

I’m not really sure what you mean by predict anomalies. Can you give more detail? Is there a paper that describes the methodology you want to implement?

isaactarume commented 5 years ago

Yes, basically I want to predict credit card fraud using a hierarchical hidden Markov model, the same methodology originally described by Fine et al. (1998) and partially implemented in your paper. The idea is to convert the HHMM into an equivalent HMM. I have done that, but I am not quite sure whether it is working, because my results for the original HMM and the HHMM (the converted one) are exactly the same. All of this is in C#. The HMMs which I am trying to convert to an HHMM were used successfully in the papers below.

  1. Shailesh S. Dhok, "Credit Card Fraud Detection Using Hidden Markov Model", 2012
  2. Abhinav Srivastava, Amlan Kundu, Shamik Sural, Arun Majumdar, "Credit Card Fraud Detection Using Hidden Markov Model", 2008

luisdamiano commented 5 years ago

> Hmmm.... There are two errors here.
>
> The first looks like a typo in main-sim.R: the Stan program being called doesn't take a G variable, so that should be safe to remove.
>
> The second looks like the sign variable is not being created in main-sim.R. From related code in main.R, it looks like it could be created as ifelse(dataset$x < L + 1, 1, 2).
>
> @luisdamiano Does that look right to you?

Sounds about right. I also found ifelse(dataset$x < L + 1, 1, 2) in the .Rmd file, but that line returns all ones, which doesn't make much sense to me. I'll have to look into it a bit further.
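
A quick way to look into the all-ones result is to tabulate the output of that expression. One possible explanation, which is only a guess and not something confirmed in this thread, is that the expression is being applied after `x` has already been folded into 1..L. The toy values and `L = 9` below are hypothetical.

```r
L <- 9                              # assumed from the constants in the posted code
x_raw <- c(1, 4, 9, 10, 13, 18)     # hypothetical observations coded 1..2L

# Folding step from the posted snippet: codes above L are mapped back to 1..L
x_folded <- ifelse(x_raw < L + 1, x_raw, x_raw - L)

sign_from_raw    <- ifelse(x_raw    < L + 1, 1, 2)  # mixes ones and twos
sign_from_folded <- ifelse(x_folded < L + 1, 1, 2)  # every value is <= L, so all ones

table(sign_from_raw)
table(sign_from_folded)
```

If `range(dataset$x)` never exceeds `L` at the point where `sign` is computed, the expression can only return ones.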

michaelweylandt commented 5 years ago

@isaactarume Are [1,2] the papers you're trying to extend?

If so, I'm still a bit confused by your question: I don't see any hierarchical structure in either paper, so what conversion are you trying to do? When [2] uses the word "hierarchical", they don't mean an HHMM of the type we consider here (where there are known structures in the hidden states).

[1] http://www.ijarcs.info/index.php/Ijarcs/article/view/1239/1227
[2] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4358713

isaactarume commented 5 years ago

Hi. Yes, that's true, there are no hierarchical structures in the papers cited above, but the credit card fraud problem I need to solve does have hierarchical structures, which may be either imagined or real; that is what we need to find out. Remember, I have already solved it using an HMM and now want to implement the HHMM solution. I understand it is difficult to contextualize without the full brief of our problem, but basically we need to get this code to work in our scenario, and we are struggling to adapt it to detect fraud instead of predicting, as you are doing here.

In a nutshell, my question is: how would you go about detecting anomalies, even in this stock market data, using the same code? Assuming there was some fraud in the pricing of one of the listings and you wanted to detect it, would that be possible?

michaelweylandt commented 5 years ago

Hi @isaactarume,

I'm honestly a bit confused what you're asking. There are many methods for time series anomaly detection, some of which may use HMMs, but the best models for anything are always grounded in some sort of domain specific knowledge. The models implemented in this repo are mainly for price activity prediction and don't really deal with anomalies, so this may not be the best resource.

When you say "I have already solved using HMM and now want to implement the HHMM solution," I have trouble following. For any given HMM, there's not an "automatic" or even canonical transformation into an HHMM. The usefulness of the HHMM framework comes from prior knowledge about the hidden states of the system and the transition dynamics between them - if you don't know any of this, I don't think there's much value in using an HHMM over an HMM.

You have not provided any details about your problem -- either as working code, pseudo-code, math, or even an explanation of the underlying physical mechanisms -- so we can't help you. This thread started with a straightforward question about an undefined variable error, but metastasized into a discussion about HHMMs, credit cards, and anomaly detection somewhere along the way.

Can you please re-state your question in a self-contained way, providing us with a clear description of where you are, what you've tried, and where you want to go? Without all three of those, there's really nothing we can do to help you.

isaactarume commented 5 years ago

Hi there. Thanks for the attempt to assist, I really appreciate it.

Yes, the problem I am trying to solve is fraud detection, as in the papers above:

[1] http://www.ijarcs.info/index.php/Ijarcs/article/view/1239/1227
[2] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4358713

However, in order to try to improve the performance metrics (accuracy, recall), we introduced an HHMM, and yes, there is not much improvement, as you are saying. So we thought maybe our HHMM implementation is not working, hence we wanted to try your HHMM R implementation.

If I hear you correctly, you seem to be suggesting that even if we were to try the R HHMM implementation here, there might be little success in improving the metrics because there is still no prior knowledge. I would still like to try it nevertheless, so that we can confirm that conclusion.

michaelweylandt commented 5 years ago

To test whether your HHMM is coded properly, I'd try fake data simulation. That is, simulate some data with known true values of the parameters / states and see how well your model is able to recover them. You can do this ad hoc by just picking a few values and eye-balling it or using a more formal evaluation method such as [1].
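
A minimal sketch of the ad hoc version of that check is below, using a two-state Gaussian HMM in base R. All parameter values are arbitrary illustrations, and the sketch only tests the decoding step with the parameters held at their true values; a fuller check would also re-estimate the parameters from the simulated data.

```r
set.seed(123)

# Known "true" parameters (arbitrary illustrative values)
A     <- matrix(c(0.95, 0.05,
                  0.10, 0.90), nrow = 2, byrow = TRUE)  # transition matrix
p0    <- c(0.5, 0.5)                                     # initial distribution
mu    <- c(0, 5)                                         # emission means
sigma <- c(1, 1)                                         # emission standard deviations
n     <- 500

# 1. Simulate hidden states and observations from the known model
z <- integer(n)
z[1] <- sample(1:2, 1, prob = p0)
for (t in 2:n) z[t] <- sample(1:2, 1, prob = A[z[t - 1], ])
y <- rnorm(n, mean = mu[z], sd = sigma[z])

# 2. Decode the most likely state path with the Viterbi algorithm (log scale)
viterbi <- function(y, A, p0, mu, sigma) {
  n <- length(y)
  K <- length(p0)
  # n x K matrix of log emission densities
  log_b <- sapply(1:K, function(k) dnorm(y, mu[k], sigma[k], log = TRUE))
  delta <- matrix(-Inf, n, K)   # best log-probability of a path ending in state k
  psi   <- matrix(0L, n, K)     # back-pointers
  delta[1, ] <- log(p0) + log_b[1, ]
  for (t in 2:n) {
    for (k in 1:K) {
      cand        <- delta[t - 1, ] + log(A[, k])
      psi[t, k]   <- which.max(cand)
      delta[t, k] <- max(cand) + log_b[t, k]
    }
  }
  path <- integer(n)
  path[n] <- which.max(delta[n, ])
  for (t in (n - 1):1) path[t] <- psi[t + 1, path[t + 1]]
  path
}

# 3. Compare the decoded states with the simulated truth
z_hat    <- viterbi(y, A, p0, mu, sigma)
accuracy <- mean(z_hat == z)
accuracy
```

With emission means this far apart the decoded path should agree with the simulated states almost everywhere; a low agreement rate would point to a bug in the implementation rather than in the data.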

But to the bigger point, I wouldn't expect just switching to an HHMM to buy you anything unless you have additional prior knowledge or structure that you can impose in the HHMM formulation. At a high-level, an HHMM is just a (large) HMM with several elements of the transition probability matrix fixed. If you don't have any knowledge that lets you fix these a priori, an HHMM simplifies to an HMM. -- @luisdamiano, please correct me if your experience is different here.

[1] https://arxiv.org/abs/1804.06788

luisdamiano commented 5 years ago

@michaelweylandt Yes, any HHMM can be rewritten as an HMM. An HHMM is like having more than one HMM (each with its own transition matrix) connected by another HMM. All these transitions can be "expanded" into a single, large transition matrix with possibly a few restrictions (e.g. a zero between some of the states that are not connected by the hierarchy). It may not be the most elegant and efficient approach, but the results should be equivalent.
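
As a rough illustration of that expansion, the sketch below builds one flat transition matrix from a top-level chain over two regimes and one two-state sub-HMM per regime. This is a deliberately simplified construction (no explicit end states, and each regime is always entered through its first state), and all numbers are made up for illustration.

```r
# Top-level chain over two regimes (illustrative values)
P_top <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)

# One sub-HMM transition matrix per regime
A_sub <- list(
  matrix(c(0.7, 0.3,
           0.4, 0.6), nrow = 2, byrow = TRUE),
  matrix(c(0.5, 0.5,
           0.1, 0.9), nrow = 2, byrow = TRUE)
)

# Entry distribution of each regime: always enter through its first state
entry <- list(c(1, 0), c(1, 0))

n_reg <- 2   # number of regimes
n_sub <- 2   # states per regime

A_flat <- matrix(0, n_reg * n_sub, n_reg * n_sub)
for (i in 1:n_reg) {
  rows <- (i - 1) * n_sub + 1:n_sub
  for (j in 1:n_reg) {
    cols <- (j - 1) * n_sub + 1:n_sub
    if (i == j) {
      # Stay in regime i and move according to its own sub-HMM
      A_flat[rows, cols] <- P_top[i, i] * A_sub[[i]]
    } else {
      # Switch to regime j and land on its entry distribution
      A_flat[rows, cols] <- P_top[i, j] * matrix(entry[[j]], n_sub, n_sub, byrow = TRUE)
    }
  }
}

A_flat
rowSums(A_flat)   # each row is still a probability distribution
```

The zeros in the second and fourth columns of the off-diagonal blocks are the structural restrictions: a regime's second state cannot be reached directly from the other regime.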

@isaactarume First, I'm not surprised that an HHMM didn't give you any improvement. It seems that you took a credit fraud model originally created as an HMM and arbitrarily added a hierarchy, even though you didn't find any domain-specific theory that would back the hierarchy you established. For an HHMM to lead to a gain in information, you need to think about the theoretical aspects of credit fraud and then try to create a structure representing the complexities that the financial theory describes.

It seems that you may have just fit an HHMM to get a "better fit", but unfortunately going from HMM to HHMM will not improve prediction by itself. This is not like adding a quadratic term in a linear regression, or adding some extra layers in a neural network, just to get a better goodness of fit and/or cross-validation score. Going from HMM to HHMM is only worthwhile if you have some financial model that you translate into a hierarchy of latent models.

Additionally, now that you mentioned you fit a HHMM to your data, what software did you use to fit it? Did you use any public R Package or Python lib, or did you write some code yourself? From what I recall, the algorithms behind estimating the parameters of a HHMM are involved, so I was curious about what software you used.

isaactarume commented 5 years ago

Thanks guys for the great insights, you couldn't have put it better. That is exactly what my methodology was: I converted the HMM into an HHMM almost arbitrarily, though not entirely.

I have introduced certain domain-specific knowledge into my hidden state transitions (matrix A). Basically, there are certain restrictions in the A matrix which reflect the dynamics of credit card transactions. I even tried various scenarios with different A matrices in which some transitions have zero probability, as you have in your paper.

There were a few runs where there was an improvement from HMM to HHMM, but the results were not consistent across runs.

In terms of implementation, I used a C# library called Accord (accord on GitHub), as it already has HMM classes; I then extended those classes to obtain the HHMM implementation. I hope this is clearer. Ideally I would like to implement the same code here in R and then use your HHMM function in R, just to make sure everything is working fine.

michaelweylandt commented 5 years ago

When you say "there were a few runs where there is improvement from HMM to HHMM," were these runs on your actual data or on fake data simulated from the priors & from the HHMM model?

If the latter, then there's an issue with your inference; if the former, then your data might not be well modeled by an HHMM.

isaactarume commented 5 years ago

Yes, it was on synthetic data simulated from the priors and the HMM model. So I initially created an HMM model which generates the training data sets.

I then used this data for training and testing both the HMM and the HHMM.

michaelweylandt commented 5 years ago

If you simulated the data from the HMM, I'm not surprised that an HHMM didn't do better / did slightly worse. Bayesian decision theory says that the Bayesian estimator with the "right" priors is essentially impossible to beat on the model. (Of course, in real life, we're basically always off the model)

Does your fake data simulation favor the HHMM when simulating from the HHMM? (It should, but it's always good to check)
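
A stripped-down version of that check is sketched below: simulate from a transition matrix with HHMM-style structural zeros, then compare the log-likelihood of the data under that matrix and under an unstructured alternative. It compares fixed transition matrices rather than fitted models, and the matrices, emission means and sample size are illustrative assumptions.

```r
set.seed(42)

# Structured ("HHMM-like") transition matrix with structural zeros (illustrative)
A_hhmm <- matrix(c(0.63, 0.27, 0.10, 0.00,
                   0.36, 0.54, 0.10, 0.00,
                   0.20, 0.00, 0.40, 0.40,
                   0.20, 0.00, 0.08, 0.72), nrow = 4, byrow = TRUE)

# Unstructured alternative: every transition equally likely
A_unif <- matrix(0.25, 4, 4)

p0 <- rep(0.25, 4)      # initial distribution
mu <- c(-3, -1, 1, 3)   # Gaussian emission means, unit standard deviation
n  <- 1000

# Simulate states and observations from the structured model
z <- integer(n)
z[1] <- sample(1:4, 1, prob = p0)
for (t in 2:n) z[t] <- sample(1:4, 1, prob = A_hhmm[z[t - 1], ])
y <- rnorm(n, mean = mu[z], sd = 1)

# Scaled forward algorithm returning the log-likelihood of y
forward_loglik <- function(y, A, p0, mu, sd = 1) {
  b <- sapply(mu, function(m) dnorm(y, m, sd))   # n x K emission densities
  alpha <- p0 * b[1, ]
  ll    <- log(sum(alpha))
  alpha <- alpha / sum(alpha)
  for (t in 2:length(y)) {
    alpha <- as.vector(alpha %*% A) * b[t, ]
    ll    <- ll + log(sum(alpha))
    alpha <- alpha / sum(alpha)
  }
  ll
}

ll_hhmm <- forward_loglik(y, A_hhmm, p0, mu)
ll_unif <- forward_loglik(y, A_unif, p0, mu)
c(structured = ll_hhmm, unstructured = ll_unif)
```

If the structured model does not come out ahead on data simulated from itself, the likelihood code is the first place to look.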

It's good that you have a priori knowledge about an HHMM-type structure (and it's definitely a good idea to try to use it), but if you're putting that in the HHMM model (i.e., the likelihood) but not into your simulation's data generating process, there's a mismatch which will cause problems. (Since the HHMM and HMM are close, these problems may not be too large in practice, as you've noted.)

isaactarume commented 5 years ago

Hi Michael. Yes, thanks for those insights, that's exactly what might have happened, although my data is actually a mixture of different HMM models.

I haven't tried HHMM-simulated data; it's high time I tried it as well. But I still need to establish which model is best. Perhaps I should use an equal mixture of HMM-simulated and HHMM-simulated data and train on that set. Whichever comes out best will be my model.

Yes, there might be a mismatch when I add priors, but the models seem to be dealing with that during training and it's not affecting much. I will try the several options you suggested and see what comes out.

michaelweylandt commented 5 years ago

That all seems reasonable. If you're uncertain about a few model structures, you might also try some sort of predictive model combination, e.g., the models discussed in https://projecteuclid.org/euclid.ba/1516093227. The loo package in R implements many of the methods from that paper.
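
As a rough sketch of what that could look like, `loo::stacking_weights()` takes a matrix of pointwise log predictive densities with one column per model. The matrix below is filled with placeholder values from two toy Gaussian models purely to show the call; in practice each column would hold leave-one-out log predictive densities from a fitted HMM or HHMM.

```r
set.seed(7)
y <- rnorm(200)   # placeholder observations, not real data

# Pointwise log predictive densities, one column per candidate model.
# Placeholder values only: real ones would come from the fitted models.
lpd_point <- cbind(
  model_a = dnorm(y, mean = 0,   sd = 1, log = TRUE),
  model_b = dnorm(y, mean = 0.5, sd = 1, log = TRUE)
)

# Only run the stacking step if the loo package is installed
if (requireNamespace("loo", quietly = TRUE)) {
  w <- loo::stacking_weights(lpd_point)
  print(w)
}
```

The resulting weights sum to one and can be used to combine the models' predictions instead of picking a single winner.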