scrtlabs / catalyst

An Algorithmic Trading Library for Crypto-Assets in Python
http://enigma.co
Apache License 2.0
2.49k stars 725 forks

Functions to extract OHLCV data #504

Open sam31415 opened 6 years ago

sam31415 commented 6 years ago

Extracting price data for research purposes is currently rather cumbersome and very slow, since it requires running a backtest. I wrote two functions that simply read the bcolz data and return OHLCV data for either a single pair or a list of pairs. I'd be happy to submit a pull request to include them in the catalyst codebase so that everybody can use them, but I'm not sure where to put them.

Do you have any recommendation? Maybe we could create a new research.py file in catalyst/catalyst/data to gather tools that are useful for research, as opposed to backtesting.

lenak25 commented 6 years ago

Thanks @sam31415 !

I would suggest adding the new functions to a new directory, catalyst/utils, as they are catalyst utility functions (the catalyst code base is under catalyst/catalyst, so standalone functions that do not necessarily run within catalyst itself can be in a different hierarchy).

Thanks, Lena

sam31415 commented 6 years ago

Actually, my installation of Catalyst already has a directory catalyst/utils, filled with files inherited from zipline, I presume. I'll add a research.py file there.

lenak25 commented 6 years ago

You don't mean catalyst/catalyst/utils?

sam31415 commented 6 years ago

Ah, sorry for the confusion. Now I see my catalyst folder is a bit weird: many of the folders in catalyst/catalyst/ are duplicated in catalyst/... That's probably not normal. So should I add the files to catalyst/utils or catalyst/catalyst/utils? If I want it to be possible to import using from catalyst.utils import ..., I should probably add my code in catalyst/catalyst/utils, right?

lenak25 commented 6 years ago

Yes, you are correct: if you want to add your code using from catalyst.utils import ... your functions should be under catalyst/catalyst/utils.

Then perhaps a better idea would be, similar to what you originally proposed, a new directory named research located under catalyst/catalyst.

sam31415 commented 6 years ago

Ok. I still have a problem. I added research.py to catalyst/catalyst/utils but I can't import it. dir(catalyst.utils) doesn't show the new file as a submodule. Do I need to specify somewhere that this file is importable? The __init__.py file in utils is empty.
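For context, `dir()` on a package only lists submodules that have already been imported, so a missing name there does not by itself mean the file cannot be imported. A minimal sketch with a throwaway package (the names `demo_pkg` and `research` are made up for illustration and have nothing to do with Catalyst's real layout):

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: demo_pkg/__init__.py and demo_pkg/research.py
# (both names are hypothetical, chosen only for this illustration).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # empty, like the utils one
(pkg_dir / "research.py").write_text("def hello():\n    return 'hi'\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
import demo_pkg

# The submodule has not been imported yet, so dir() does not list it.
listed_before = "research" in dir(demo_pkg)

# pkgutil scans the package directory on disk instead.
on_disk = [m.name for m in pkgutil.iter_modules(demo_pkg.__path__)]

# Importing the submodule binds it as an attribute of the package.
importlib.import_module("demo_pkg.research")
listed_after = "research" in dir(demo_pkg)

print(listed_before, on_disk, listed_after)
```

If `pkgutil.iter_modules` does not show the new file either, Python is probably looking at a different copy of the package than the one that was edited.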

lenak25 commented 6 years ago

This should work. You should be able to import your module using import catalyst.utils.research and then run your function using catalyst.utils.research.your_function_name(). But please do not use the utils directory; instead, create a new directory research for your new module (and add an empty __init__.py to it). Following the same logic, you would then run your function with:

from catalyst.research import research
research.your_function_name()

sam31415 commented 6 years ago

Thanks for the tips. Yes, I also thought this should work, but it doesn't... Using a new research directory doesn't allow imports either. Does Catalyst need to be reinstalled with some changes configuring research as an importable directory? dir(catalyst) doesn't show the new research module... I must be missing something obvious...

lenak25 commented 6 years ago

Did you install catalyst from the source code, which lets you run the most up-to-date version of the library, using: pip install -e <catalyst root dir>?
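With a regular (non-editable) install, Python imports the copy under site-packages, so files added to the source checkout are never seen. One way to check which copy is in use is to look at where the package is loaded from. A rough sketch, where the helper name `package_origin` is made up and `json` is only a stand-in so the snippet runs anywhere (substitute `"catalyst"`):

```python
import importlib.util
from pathlib import Path


def package_origin(name):
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    # find_spec returns None for unknown names; namespace packages have no origin.
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


# Stand-in package; replace "json" with "catalyst" in a real environment.
print(package_origin("json"))
```

If the printed path points into site-packages rather than into the checkout, the edited files are not the ones being imported.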

sam31415 commented 6 years ago

Ah maybe that's the problem. When I try to run this command I get a gcc error, gcc: error: unrecognized command line option '-mno-cygwin'. Reinstalling gcc doesn't help. I'll look into this more when I have some time. Anyway, thanks!

lenak25 commented 6 years ago

Hi @sam31415 , were you able to overcome your issues?

If you still have trouble with creating a development environment please have a look at this section of the docs.

sam31415 commented 5 years ago

Hi @lenak25, installing Catalyst and making everything work has been a big pain and I don't really feel like starting again unless I'm absolutely sure this is the source of the problem. I've been looking into why the new directory doesn't appear in the catalyst namespace, but haven't found anything conclusive so far.

usgoose commented 5 years ago

@sam31415 can you share the code to those functions please?

sam31415 commented 5 years ago

Ok, I created a pull request to include the class in the Catalyst codebase. If you're in a hurry and want to use it right now, here is the code: https://github.com/sam31415/catalyst/blob/develop/research/data_accessor.py There is an example of how to use it in the docstring.