AI4Finance-Foundation / FinRL-Meta

FinRL-Meta: Dynamic datasets and market environments for FinRL.
https://ai4finance.org
MIT License
1.28k stars, 583 forks

Vix #81

Closed cryptocoinserver closed 2 years ago

cryptocoinserver commented 2 years ago

Tested it with the Yahoo Finance processor. There is a problem though: the clean_data function removes the vix column again. I wanted to fix that, but the code is pretty complicated and it's probably better if the original author fixes it. Also, clean_data in the Yahoo Finance processor uses backward fill? That introduces lookahead, because gaps get filled with values from later dates. It should be addressed too.

# if close on start date is NaN, fill data with first valid close
# and set volume to 0.

So this PR isn't ready and needs more work. The other processor needs to be tested. It provides a solid foundation for making the vix work again though. Hope it helps.
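
The lookahead concern with backward fill can be illustrated with a small pandas sketch. This is not the Yahoo Finance processor's clean_data code; the dates and prices are made up, and it only shows how `bfill` and `ffill` treat the same gaps.

```python
import numpy as np
import pandas as pd

# Toy daily closes with a gap on the start date and one in the middle
# (made-up values, not real market data).
dates = pd.date_range("2021-01-04", periods=5, freq="D")
close = pd.Series([np.nan, 100.0, np.nan, 103.0, 104.0], index=dates, name="close")

# Backward fill copies the NEXT known value into a gap: the row for
# 2021-01-06 receives 103.0, a price that is only known on 2021-01-07.
backward = close.bfill()

# Forward fill copies the LAST known value: the row for 2021-01-06 receives
# 100.0. The leading NaN stays NaN because nothing has been observed yet.
forward = close.ffill()

print(pd.DataFrame({"raw": close, "bfill": backward, "ffill": forward}))
```

Filling the start date with the first valid close, as the quoted comment describes, is the same pattern as the first row of the `bfill` column. Whether that matters in practice depends on how the start of the series is used downstream.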

rayrui312 commented 2 years ago

Thanks for your contributions!

I have some different thoughts about the first point, "removed the clean data code from the basic_processor because it's a duplicate. The cleaning steps are individual to each processor and should be / are done there." For now, we haven't unified all the data cleaning methods, so different processors may have different cleaning methods. We plan to unify them and then remove the per-processor cleaning methods. In the future, we will only use the cleaning method in the basic processor.

The other points help a lot, thanks very much! Could you please revise the part related to the first point? We will merge this pull request then.

cryptocoinserver commented 2 years ago

Happy to help. Brought it back.

zhumingpassional commented 2 years ago

Thanks for your valuable code.

I have several comments.

1) In finrl_meta/data_processors/processor_joinquant.py, fields=["time", "open", "high", "low", "close", "volume"]: joinquant does not support the field “time”, so this line should not be changed. In clean_data, “date” will be renamed to “time”.

2) finrl_meta/data_processors/processor_alpaca.py, line 76: with your code, if we execute download_data several times with different start and end dates, all of that data will be stored in self.dataframe unless we empty self.dataframe after each execution, which means more work. I think it should not be changed.
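
The accumulation concern in point 2 can be sketched as follows. `ToyProcessor`, `_fetch`, `download_append`, and `download_replace` are hypothetical names for illustration only; this is not the code in processor_alpaca.py, and the data is fake.

```python
import pandas as pd


class ToyProcessor:
    """Hypothetical stand-in, not the FinRL-Meta Alpaca processor."""

    def __init__(self):
        self.dataframe = pd.DataFrame()

    def _fetch(self, tickers, start, end):
        # Fake data source: one row per ticker per calendar day.
        days = pd.date_range(start, end, freq="D")
        rows = [(day, tic, 1.0) for day in days for tic in tickers]
        return pd.DataFrame(rows, columns=["time", "tic", "close"])

    def download_append(self, tickers, start, end):
        # Each call adds rows to whatever is already stored.
        new = self._fetch(tickers, start, end)
        if self.dataframe.empty:
            self.dataframe = new
        else:
            self.dataframe = pd.concat([self.dataframe, new], ignore_index=True)

    def download_replace(self, tickers, start, end):
        # Each call overwrites the stored data.
        self.dataframe = self._fetch(tickers, start, end)


appending = ToyProcessor()
appending.download_append(["AAA"], "2021-01-01", "2021-01-03")
appending.download_append(["AAA"], "2021-01-02", "2021-01-04")
appended_rows = len(appending.dataframe)
# Rows that share a (time, tic) pair with an earlier row.
overlap_rows = int(appending.dataframe.duplicated(["time", "tic"]).sum())

replacing = ToyProcessor()
replacing.download_replace(["AAA"], "2021-01-01", "2021-01-03")
replacing.download_replace(["AAA"], "2021-01-02", "2021-01-04")
replaced_rows = len(replacing.dataframe)
```

With the appending variant, overlapping date ranges leave duplicate rows in the stored frame unless the caller clears it between calls. The replacing variant avoids that, but it cannot add a second download, such as the vix, on top of the first.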

cryptocoinserver commented 2 years ago

1. Oh good catch. Misunderstood the fields parameter.

2. But we need to be able to call it multiple times to get/add the vix. Where is the df emptied?

zhumingpassional commented 2 years ago

"But we need to be able to call it multiple times to get/add the vix." Response: generally, adding the vix is executed only once. Therefore, this code should not be changed.