pranjal-joshi / Screeni-py

A Python-based stock screener to find stocks with potential breakout probability from NSE India.
MIT License
557 stars 197 forks

Enhancement: running stock screener using multiprocessing to run in parallel. #43

Closed swarpatel23 closed 3 years ago

swarpatel23 commented 3 years ago

Currently, the code uses a single core to do the screening, but we can run it in parallel to make it faster. I changed screenipy.py to use all cores via the multiprocessing module. Can I open a PR? I will create a new file named screenipy_parallel.py so that your original code does not break.

pranjal-joshi commented 3 years ago

Glad to hear from you! 👍🏼

Sure, you can implement multiprocessing. Please go through the Contributing Guidelines to get started.

My primary observation is that the majority of the time is consumed by the yfinance module downloading stock data, which depends on network speed, and that it already uses multiple threads internally by default.

If you successfully implement multiprocessing, please post the time difference between the new code and the existing single-core code for comparison.

pranjal-joshi commented 3 years ago

Also, when opening the PR, please modify your code in the screenipy.py file and send the PR to the new-features branch. This will automatically run the test cases before merging. Feel free to add more test cases in the test folder if required.

pranjal-joshi commented 3 years ago

@swarpatel23 Here are some timing details that I observed for each iteration (running on a Core i5, 2.4 GHz):

  1. Data fetching time for each stock (network dependent) = 0.25 to 0.5 sec (for this call)
  2. Data processing/analysis time for each stock = 0.125 sec (from this to this)

So it would be great if you could run fetching and analysis as two separate processes, with the fetching process feeding the received data to the analysis process while it fetches the next stock code.
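The fetch/analyse split suggested here could be sketched roughly as below. This is only an illustration, not the code from the repository: `fetch` and `analyse` are hypothetical stand-ins for the yfinance download and the screening logic, and the symbols are placeholders.

```python
import multiprocessing as mp

SENTINEL = None  # put on a queue to signal "no more items"


def fetch(symbol):
    """Hypothetical stand-in for the network download of one stock."""
    return {"symbol": symbol, "close": [100.0, 101.0, 103.0]}


def analyse(data):
    """Hypothetical stand-in for the screening logic: did the price rise?"""
    closes = data["close"]
    return data["symbol"], closes[-1] > closes[0]


def fetcher(symbols, data_queue):
    # Producer: download each stock and hand it to the analysis process.
    for symbol in symbols:
        data_queue.put(fetch(symbol))
    data_queue.put(SENTINEL)


def analyser(data_queue, result_queue):
    # Consumer: analyse each item while the producer fetches the next one.
    while True:
        data = data_queue.get()
        if data is SENTINEL:
            break
        result_queue.put(analyse(data))
    result_queue.put(SENTINEL)


def run_pipeline(symbols):
    data_queue, result_queue = mp.Queue(), mp.Queue()
    producer = mp.Process(target=fetcher, args=(symbols, data_queue))
    consumer = mp.Process(target=analyser, args=(data_queue, result_queue))
    producer.start()
    consumer.start()
    results = []
    # Drain the results before joining so the consumer never blocks on a full pipe.
    while True:
        item = result_queue.get()
        if item is SENTINEL:
            break
        results.append(item)
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["AAA", "BBB", "CCC"]))
```

With the timings above (fetching slower than analysis), a pipeline like this would mostly hide the analysis time behind the network wait, so the fetching step would remain the bottleneck.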

swarpatel23 commented 3 years ago

The approach I tried distributes the stock symbols equally among all processes.
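A minimal sketch of that kind of even split, assuming a `multiprocessing.Pool` is used; the PR itself may do it differently. `split_evenly` and `screen_chunk` are hypothetical names, and the filter inside `screen_chunk` is only a placeholder for the real screening.

```python
from multiprocessing import Pool


def split_evenly(items, n_parts):
    """Split items into n_parts contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def screen_chunk(symbols):
    """Hypothetical stand-in for screening one chunk; keeps symbols starting with 'A'."""
    return [s for s in symbols if s.startswith("A")]


if __name__ == "__main__":
    symbols = ["AAA", "BBB", "ABC", "CCC", "ADD", "EEE"]  # placeholder symbols
    n_processes = 4
    with Pool(processes=n_processes) as pool:
        per_process = pool.map(screen_chunk, split_evenly(symbols, n_processes))
    # Flatten the per-process results back into one list.
    print([s for chunk in per_process for s in chunk])
```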

I ran the code with option 1: Screen stocks for Breakout or Consolidation

Sequential code: 26 minutes 54 seconds
Parallel code (8 processes): 4 minutes 35 seconds
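For reference, the speedup implied by these two timings can be worked out as follows. Parallel efficiency is computed here as speedup divided by the number of processes, which is a common convention and not something stated in the thread.

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


sequential = to_seconds(26, 54)  # 1614 seconds
parallel = to_seconds(4, 35)     # 275 seconds
processes = 8

speedup = sequential / parallel
efficiency = speedup / processes  # fraction of ideal linear scaling

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
# prints "speedup: 5.87x, efficiency: 73%"
```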

The problem I have is that processes do not share state, so "screenCounter" does not work as expected.

pranjal-joshi commented 3 years ago

The time difference measured by you seems like a fantastic improvement.

Before we merge, can you please look at multiprocessing.Value to fix the screenCounter issue? With Value, a variable can be shared between multiple processes.
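A minimal sketch of a shared counter built on multiprocessing.Value, assuming a plain integer counter is enough. The `screen` function and the symbol chunks are hypothetical and are not the actual screenCounter code.

```python
import multiprocessing as mp


def screen(symbols, screen_counter):
    """Hypothetical worker: bump the shared counter once per screened symbol."""
    for _ in symbols:
        # `value += 1` is a read-modify-write, so hold the lock while updating.
        with screen_counter.get_lock():
            screen_counter.value += 1


if __name__ == "__main__":
    screen_counter = mp.Value("i", 0)  # "i" = signed int, initial value 0
    chunks = [["AAA", "BBB"], ["CCC", "DDD", "EEE"]]  # placeholder symbols
    workers = [mp.Process(target=screen, args=(chunk, screen_counter)) for chunk in chunks]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(screen_counter.value)  # total across both processes
```

Note that a Value has to be handed to the workers when they are created (as a Process argument, or through a Pool initializer); passing it as an argument to pool.map raises a RuntimeError.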

Looking forward to merging #44 👍🏼

pranjal-joshi commented 3 years ago

After further development to get multiprocessing working on Windows with pyinstaller, the binaries for #44 were released in 2d4299d (v1.14).
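The thread does not say what that further development involved. One requirement the Python documentation does state for frozen Windows executables is a call to multiprocessing.freeze_support() at the start of the main guard. The sketch below is a generic illustration with a hypothetical `square` worker, not the change made in the commit.

```python
import multiprocessing


def square(n):
    """Hypothetical worker function."""
    return n * n


def main():
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Needed when the script is frozen into a Windows executable;
    # it has no effect when run as a normal Python script.
    multiprocessing.freeze_support()
    print(main())
```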