Nan loss Value - Githubissues

MrBeastgeek commented 1 year ago

Hello,

Thank you for publishing this package, it is really helpful. I tried to follow the starter example https://deepdow.readthedocs.io/en/latest/auto_examples/end_to_end/getting_started.html#sphx-glr-auto-examples-end-to-end-getting-started-py, and I applied it to my own portfolio dataset, but I am getting NaN values in the training phase.

How can I fix this problem? Are there any other networks that I could experiment with?

Thank you in advance

jankrepl commented 1 year ago

Hey @MrBeastgeek and thank you for your interest!

I remember seeing a couple of similar issues and in all of the cases the problem was in the input data (e.g. prices equal to 0 which would lead to division by 0 when computing returns).

Anyway, I can definitely help further if you provide a minimal reproducible example.
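
As a rough illustration of the kind of input problem described above, the sketch below scans a price matrix for zero, negative, or non-finite entries before returns are computed. It uses plain NumPy only; the `prices` array and the helper name `find_bad_prices` are made up for this example and are not part of deepdow.

```python
import numpy as np


def find_bad_prices(prices):
    """Return (row, column) indices of prices that are non-finite or <= 0.

    Hypothetical helper for illustration, not part of deepdow.
    """
    prices = np.asarray(prices, dtype=float)
    bad = ~np.isfinite(prices) | (prices <= 0)
    return np.argwhere(bad)


# Toy price matrix: rows are timesteps, columns are assets.
prices = np.array([
    [100.0, 50.0],
    [101.0, 0.0],   # a zero price
    [102.0, 51.0],
])

print(find_bad_prices(prices))

# Simple returns divide by the previous price, so the zero above
# yields a non-finite return one timestep later.
with np.errstate(divide="ignore", invalid="ignore"):
    returns = prices[1:] / prices[:-1] - 1
print(np.isfinite(returns))
```

If the first check reports any indices, the returns built from those prices will likely contain `inf` or `NaN`, which then propagate into the loss.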

pepi99 commented 11 months ago

Hello @jankrepl. In your "Basic example", you set X to be the standardized returns per asset, and y as well. So n_channels is 1, because there is just one feature: the return (per asset).

I understand you have done this for the sake of the example.

In my case, I want to feed the model other data with multiple channels (12 in my case). I didn't change y (I just let it be the returns per coin), but I got an error:

35, in __init__
    raise ValueError('X and y need to have the same number of input channels.')
ValueError: X and y need to have the same number of input channels.

So what I did was basically change y to be the same as X, but with the gaps.

Now I get the same NaN loss, because some of the features (which are not returns) are sometimes 0.

So my question is basically: is it possible to use different shapes for X and y? I want X to be some feature matrix independent of the returns (maybe also including the returns), and y to be solely the returns.

If yes, how do I specify which dimension is the return column so that the benchmarks are computed?

If no, it seems that the model will try to predict y, which is basically features, and I only want it to have returns as y.

pepi99 commented 11 months ago

To add to my question, do you think this framework is appropriate for the following problem:

I have historical positions (multiple per asset) that I want to use as features to generate final positions. If I have 50 positions for an asset, I would like to narrow them down to a single position using ML, and I want to optimize for higher return and lower risk.

jankrepl commented 11 months ago

Yes, for training, X and y need to have the same number of channels. That is a design choice. It is done this way so that you can write custom losses that depend not only on the returns but also on anything else that you have in your channels (e.g. volume, ...).

However, if for some reason you don't have the extra channels for y, you can simply insert some random channels into the tensor and then, inside the losses, specify which channel is the "return" channel. See below for an example loss:

https://github.com/jankrepl/deepdow/blob/341b34d324cf8054f2f59f092b3afcdd031bc828/deepdow/losses.py#L739-L767
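
A minimal NumPy sketch of that padding idea follows. It assumes y is laid out as (n_samples, n_channels, horizon, n_assets); the sizes, the `returns_channel` index, and the use of zeros as filler (rather than random values) are choices made for this illustration only.

```python
import numpy as np

# Illustrative sizes; 12 channels matches the feature count mentioned above.
n_samples, n_channels, horizon, n_assets = 8, 12, 5, 3
returns_channel = 0  # assumed position of the returns inside y

rng = np.random.default_rng(0)

# Returns-only target, shape (n_samples, horizon, n_assets).
y_returns = rng.normal(scale=0.01, size=(n_samples, horizon, n_assets))

# Build a target with the same number of channels as X.
# Every channel except `returns_channel` is filler.
y = np.zeros((n_samples, n_channels, horizon, n_assets))
y[:, returns_channel] = y_returns

# A loss that only needs returns slices that one channel back out.
recovered = y[:, returns_channel, ...]
print(y.shape, recovered.shape)
```

As long as the loss reads only the returns channel, the filler values should not influence training.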

jankrepl commented 11 months ago

So my understanding is that your features are basically portfolio weights/absolute positions (I would imagine coming from some 3rd party portfolio managers) in the past and you are trying to predict optimal future portfolio weights?

Sounds reasonable to me and doable with deepdow (but hacky). And yes, this use case is not the primary use case I had in mind when building the tool.

In your case X does not contain historical returns, whereas y needs to contain them since you want to compute losses.

My understanding is that you are hitting this exception: https://github.com/jankrepl/deepdow/blob/341b34d324cf8054f2f59f092b3afcdd031bc828/deepdow/data/load.py#L38

In theory, you can simply try to remove that condition and the training might still work. I am not 100% sure; you need to play around with it and see :)

Edit: See below a common pattern for how a lot of the predefined losses actually work:

https://github.com/jankrepl/deepdow/blob/341b34d324cf8054f2f59f092b3afcdd031bc828/deepdow/losses.py#L794

You basically just extract a single channel from the y tensor corresponding to the returns and only do computations using the single channel. So there shouldn't be an issue:)
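
To make the pattern concrete, here is a toy NumPy loss written in the same spirit. It is a sketch, not deepdow's implementation: the function name `negative_mean_return`, the tensor layout (n_samples, n_channels, horizon, n_assets), and the default `returns_channel=0` are assumptions for illustration.

```python
import numpy as np


def negative_mean_return(weights, y, returns_channel=0):
    """Toy loss: negative mean portfolio return, one value per sample.

    weights: (n_samples, n_assets)
    y:       (n_samples, n_channels, horizon, n_assets)
    """
    # Keep only the returns channel -> (n_samples, horizon, n_assets).
    asset_returns = y[:, returns_channel, ...]
    # Weighted sum over assets -> (n_samples, horizon).
    portfolio_returns = (asset_returns * weights[:, None, :]).sum(axis=-1)
    return -portfolio_returns.mean(axis=-1)


y = np.zeros((2, 3, 4, 2))
y[:, 0] = 0.01    # returns channel: every asset earns 1% at every step
y[:, 1:] = 999.0  # filler channels, never read by the loss

weights = np.array([[0.5, 0.5], [1.0, 0.0]])
loss = negative_mean_return(weights, y)
print(loss)
```

Changing the filler channels leaves `loss` unchanged, which is why extra non-return channels in y should be harmless for losses built this way.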

pepi99 commented 11 months ago

Thank you for your fast response and for understanding my use case. You are right, I've got positions (n positions per asset, per hour) and I want these to be my actual features, with y being the returns. Based on the positions that I already have (multiple per asset), my goal is to "combine" them, in other words, to make just one final position per asset.

It seems that it is possible to do this with your library and I will play around with it. Thanks for the assistance!

jankrepl / deepdow

Nan loss Value #140