fja05680 / pinkfish

A backtester and spreadsheet library for security analysis.
https://fja05680.github.io/pinkfish
MIT License

Need facility for using data sources other than yfinance/Yahoo #43

Closed EcoFin closed 2 years ago

EcoFin commented 2 years ago

The data handling in pinkfish is too tightly bound to yfinance/Yahoo. It would be a valuable enhancement to be able to deliver data easily to pinkfish from other data sources. Lots of individual investors/students/researchers will be using non-yahoo data sources, many housed in databases, not csv files.

It shouldn't be hard, because all we really need (in the first instance) is to deliver the basic ts dataframe. I ran an experiment using my Norgate database since I already know how to make it deliver data to zipline and backtrader. I have now given up.

I thought I would just write the df out of the Norgate db. That turned out to be problematic so I tried copying a Norgate csv into the cache directory. Here are the problems:

  1. pinkfish has yahoo column names hard-coded. But I do all the data configuration and adjustment in the database before getting anywhere near a backtester. Column names are not the same and I have several timeseries columns pinkfish doesn't know about a priori but that I might want to use; no need to throw them away. To try to move forward, I reconfigured my csv.

  2. The showstopper seems to be the fetch_timeseries and select_timeperiod in Benchmark. Benchmark just doesn't want to use the cache at all and seems to insist on trying to download from yahoo. None of the small code patches I tried have solved the problem.
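
The column-name mismatch in point 1 could in principle be handled outside pinkfish with a small pandas adapter. This is a minimal sketch, not pinkfish code: the vendor column names in `COLUMN_MAP` and the function `to_yahoo_columns` are hypothetical, and the target names are the ones yfinance has traditionally returned (Open, High, Low, Close, Adj Close, Volume). The names pinkfish expects internally may differ and should be checked against its source.

```python
import pandas as pd

# Hypothetical vendor column names mapped to Yahoo-style names (assumption).
COLUMN_MAP = {
    "px_open": "Open",
    "px_high": "High",
    "px_low": "Low",
    "px_close": "Close",
    "px_adj_close": "Adj Close",
    "vol": "Volume",
}


def to_yahoo_columns(df: pd.DataFrame, column_map: dict = COLUMN_MAP) -> pd.DataFrame:
    """Rename mapped columns to Yahoo-style names; extra columns are kept as-is."""
    missing = [src for src in column_map if src not in df.columns]
    if missing:
        raise KeyError(f"source frame is missing columns: {missing}")
    out = df.rename(columns=column_map)   # returns a copy, the input is untouched
    out.index = pd.to_datetime(out.index)
    out.index.name = "Date"
    return out.sort_index()


# Made-up sample rows, including one extra column ("turnover") that is preserved.
raw = pd.DataFrame(
    {
        "px_open": [10.0, 10.5],
        "px_high": [10.8, 11.0],
        "px_low": [9.9, 10.4],
        "px_close": [10.6, 10.9],
        "px_adj_close": [10.6, 10.9],
        "vol": [1000, 1200],
        "turnover": [0.02, 0.03],
    },
    index=["2021-07-02", "2021-07-01"],
)
ts = to_yahoo_columns(raw)
print(ts.columns.tolist())
```

Extra series ride along unchanged, so nothing has to be thrown away before the frame reaches the backtester.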

In short,

Once again, I really enjoy pinkfish and appreciate the elegance of the codebase! But I cannot use it if I'm locked into yahoo data.

btw: backtrader already discovered this and offers 3 or 4 generic data interfaces. At the other end of the spectrum, zipline's data handling is almost impossible. Easy, flexible pandas-based data handling could really differentiate pinkfish.

best regards ay

fja05680 commented 2 years ago

Hi ay,

I really appreciate the time you have taken to look into pinkfish and your kind words regarding the code base. I'm sorry it isn't able at this time to meet the needs that you have identified. Please understand that I'm essentially the lone developer and work on it as a hobby outside of my full time profession and family. I developed it for my own use and usually only add new features when I need them. That really is the only time I have to give it. For my style of investing (short term to medium term ETFs), it does everything I need. I understand this means it will have limited appeal and a relatively small user base. That's fine. I wanted to share what I have done in case anyone else had the same requirements.

That said, I will look over the points you have made when I get the chance. From my quick reading of what you have said, I think you have raised some valid issues that I either hadn't considered or wasn't aware of. Thanks for making me aware of these issues.

Farrell

EcoFin commented 2 years ago

I certainly understand. Some things I have been able to adjust easily with minimal intervention. But I wanted to avoid anything that might make future upgrades difficult. If I were better at writing python than I actually am, I’d offer to make some code contributions since I have a simple data handling design in mind. If I think a bit more about it over the next day or two, maybe I will find some easier adjustments than I saw over the weekend. It is a lot more pleasant to work with pinkfish, especially for research purposes, than with the complex frameworks.

Best regards

arthur

fja05680 commented 2 years ago

If you can describe what it is that you have in mind regarding data handling, that would be helpful.

EcoFin commented 2 years ago

Farrell,

Maybe I’ll try to do a little diagram. But I imagine a 2D timeseries dataframe like the current ts with “symbol identified” columns. I would want to be able to fill that df either from yfinance/Yahoo downloads or from any other datasource either directly (into the df) or indirectly (via read_csv). There is no particular reason not to use the yahoo column names for price data. If someone (me, say) wants to provision from another datasource, it would be my business to provide price data with the correct names.

It might make sense to have two dataframes, one provisioned “directly” the other from csv, that could just be concatenated.
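
A minimal sketch of that concatenation, assuming both frames share a date index and carry symbol-identified columns. The symbols, column names and prices are made up for illustration, and an in-memory string stands in for a CSV file.

```python
import io

import pandas as pd

# Frame provisioned "directly", e.g. from a database query (made-up values).
direct = pd.DataFrame(
    {"SPY_close": [100.0, 101.0, 102.0]},
    index=pd.to_datetime(["2021-07-01", "2021-07-02", "2021-07-06"]),
)
direct.index.name = "date"

# Frame provisioned "indirectly" via read_csv (a string stands in for the file).
csv_text = """date,TLT_close
2021-07-01,50.0
2021-07-02,50.5
2021-07-06,51.0
"""
from_csv = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

# Column-wise concatenation aligns rows on the date index; an outer join keeps
# dates that appear in only one source (those cells become NaN).
ts = pd.concat([direct, from_csv], axis=1, join="outer").sort_index()
print(ts)
```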

Whether it was easier to have one price data df and one “other (fundamental/economic/alternative) data” df, or put it all in one doesn’t matter too much. Somewhere in the current code I found that it was already set up to let users add their own indicators. That is how I would expect to use the second df in fact: generate indicators/signals from it to add to ts.
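
A minimal sketch of using the second df that way, with pandas only. The series name `pmi`, the threshold and all values are hypothetical. The lower-frequency data is forward-filled onto the price dates so that each bar only sees values already published.

```python
import pandas as pd

# Daily price frame (made-up values) on five consecutive business days.
dates = pd.bdate_range("2021-07-01", periods=5)
ts = pd.DataFrame({"close": [100.0, 101.0, 99.5, 102.0, 103.0]}, index=dates)

# Second frame with a slower-moving, hypothetical economic series.
other = pd.DataFrame(
    {"pmi": [49.0, 52.0]},
    index=pd.to_datetime(["2021-06-30", "2021-07-06"]),
)

# Align to the price dates, carrying the last known value forward.
aligned = other.reindex(ts.index, method="ffill")

# Add the raw series and a simple derived signal as new columns of ts.
ts["pmi"] = aligned["pmi"]
ts["risk_on"] = ts["pmi"] > 50.0
print(ts)
```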

In effect, I’d put all the data handling code that prepares the ts dataframe into one level and then just pass ts to the backtester.
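
As a rough sketch of that layering, under stated assumptions: `prepare_timeseries`, `load_from_memory` and `buy_and_hold_return` are hypothetical names, not pinkfish functions, and the required column list is a guess at a minimum. Any loader that returns a date-indexed frame can be plugged in, and the consumer only ever sees the finished ts.

```python
from typing import Callable, Optional

import pandas as pd

# Assumed minimum set of price columns; adjust to what the backtester expects.
REQUIRED_COLUMNS = ["open", "high", "low", "close", "volume"]


def prepare_timeseries(
    loader: Callable[[str], pd.DataFrame],
    symbol: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> pd.DataFrame:
    """Load, validate, sort and trim a frame; the single data-handling layer."""
    ts = loader(symbol)
    missing = [c for c in REQUIRED_COLUMNS if c not in ts.columns]
    if missing:
        raise ValueError(f"{symbol}: missing columns {missing}")
    ts = ts.sort_index()
    return ts.loc[start:end]   # None on either side leaves that end open


def load_from_memory(symbol: str) -> pd.DataFrame:
    """Stand-in loader with made-up prices; a database or CSV reader goes here."""
    index = pd.to_datetime(["2021-07-01", "2021-07-02", "2021-07-06"])
    close = [100.0, 101.0, 102.0]
    return pd.DataFrame(
        {"open": close, "high": close, "low": close, "close": close,
         "volume": [1000, 1100, 1200]},
        index=index,
    )


def buy_and_hold_return(ts: pd.DataFrame) -> float:
    """Toy consumer standing in for a backtester; it only needs the ts frame."""
    return ts["close"].iloc[-1] / ts["close"].iloc[0] - 1.0


ts = prepare_timeseries(load_from_memory, "XYZ", start="2021-07-02")
result = buy_and_hold_return(ts)
print(len(ts), result)
```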

I know that is a bit easier to say than to do. I pulled the data provisioning code out of one of my equity/ETF screening gizmos to try to deliver a ts dataframe, thinking I needed to modularize it anyway. Of course, I then was reminded that there were actually 3 parts to that process: prices, metadata and some fundamental data. For backtesting we could skip the metadata for simplicity.

I could deliver the ts easily enough. But then I ran into problems with select_timeperiod and Benchmark that I wasn’t expecting.

Does that help at all?

Regards

ay
