JECSand / yahoofinancials

A powerful financial data module used for pulling data from Yahoo Finance. This module can pull fundamental and technical data for stocks, indexes, currencies, cryptos, ETFs, Mutual Funds, U.S. Treasuries, and commodity futures.
https://pypi.python.org/pypi/yahoofinancials
MIT License

ValueError: Invalid padding bytes. #132

Closed by decodedcoder 1 year ago

decodedcoder commented 1 year ago

As of 02/06/2023 this module was working fine but the following day on 02/07/2023, it stopped working with the above error. See snapshot below:

    File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/yahoofinancials/etl.py", line 664, in get_stock_data
      hist_obj=hist_obj), self.ticker)
    File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 268, in map
      return self._map_async(func, iterable, mapstar, chunksize).get()
    File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 657, in get
      raise self._value
    ValueError: Invalid padding bytes.

It looks like an issue with the _decrypt method:

    def _decrypt(encrypted_stores, password, key, iv):
        if usePycryptodome:
            cipher = AES.new(key, AES.MODE_CBC, iv=iv)
            plaintext = cipher.decrypt(encrypted_stores)
            plaintext = unpad(plaintext, 16, style="pkcs7")
        else:
            cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
            decryptor = cipher.decryptor()
            plaintext = decryptor.update(encrypted_stores) + decryptor.finalize()
            unpadder = padding.PKCS7(128).unpadder()
            plaintext = unpadder.update(plaintext) + unpadder.finalize()
        plaintext = plaintext.decode("utf-8")
        return plaintext

Specifically, the failure is at this line: plaintext = unpadder.update(plaintext) + unpadder.finalize(). After some googling, my guess is that the bytes coming out of decryption are no longer the same as what was originally encrypted, so the padding check fails. Not 100% sure why this would work one day and fail the next.
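For context, PKCS7 unpadding reads the last byte as the pad length n and checks that the final n bytes all equal n. If the decrypted block is not what was originally padded, that check almost always fails, which would be consistent with the error above. Below is a minimal pure-Python sketch of that check. The pkcs7_pad and pkcs7_unpad helpers are hypothetical illustrations, not part of yahoofinancials or the cryptography package, and the "garbage" block is only a stand-in for what decryption with a wrong key might produce.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n so the result is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS7 padding, raising ValueError if the padding is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError("Invalid padded length.")
    n = data[-1]
    # The last n bytes must all equal n, and n must be in 1..block_size.
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("Invalid padding bytes.")
    return data[:-n]


# Round trip on correctly padded data succeeds.
padded = pkcs7_pad(b'{"context": "ok"}')
assert pkcs7_unpad(padded) == b'{"context": "ok"}'

# Assumption: stand-in for a block decrypted with the wrong key.
# Its last byte is 31, which is not a valid pad length for 16-byte blocks.
garbage = bytes(range(16, 32))
try:
    pkcs7_unpad(garbage)
except ValueError as exc:
    print(exc)
```

In other words, the unpadder line is probably where the problem surfaces rather than where it originates; the key or IV derivation upstream is the more likely culprit.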

ValueRaider commented 1 year ago

yfinance dev here; our decryption is broken too. I noticed Yahoo's spam trigger was more sensitive 12 hours ago while decryption was still working, then recently it broke completely. I haven't investigated; I'll let others do that.

decodedcoder commented 1 year ago

So here is something interesting. I checked the actual byte sizes of the payload for the encrypt/decrypt and noticed that they were exactly the same size (len() function):

    len encrypted_stores:489232 and len plain_text:489232
    len encrypted_stores:498528 and len plain_text:498528
    len encrypted_stores:494880 and len plain_text:494880
    2023-02-07 15:58:02,020 DEBUG urllib3.connectionpool:1007 - Starting new HTTPS connection (1): s.yimg.com:443
    len encrypted_stores:500592 and len plain_text:500592
    len encrypted_stores:496944 and len plain_text:496944
    2023-02-07 15:58:02,085 DEBUG urllib3.connectionpool:465 - https://finance.yahoo.com:443 "GET /quote/KREF/profile?p=KREF&lang=en-US&region=US HTTP/1.1" 200 None
    2023-02-07 15:58:02,130 DEBUG urllib3.connectionpool:465 - https://s.yimg.com:443 "GET /uc/finance/dd-site/js/main.e0c853d8cea2b75a5208.min.js HTTP/1.1" 200 None
    len encrypted_stores:497056 and len plain_text:497056
    len encrypted_stores:490688 and len plain_text:490688
    len encrypted_stores:497328 and len plain_text:497328
    len encrypted_stores:494416 and len plain_text:494416
    len encrypted_stores:479872 and len plain_text:479872
    len encrypted_stores:487984 and len plain_text:487984
    multiprocessing.pool.RemoteTraceback:

Something must have changed on Yahoo's side recently, because the payload looks fine even when I print it out. Perhaps they nested a list or another dict (not sure whether that would have any effect on the actual size).
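A note on the equal lengths: with AES in CBC mode, the raw decrypted output is always the same length as the ciphertext, and the PKCS7 padding (1 to 16 bytes) is only stripped afterwards by the unpad step. So, assuming plain_text was measured before unpadding, matching lengths are expected and do not by themselves say anything about the payload's structure. The sketch below checks that every size in the log is a whole number of 16-byte blocks; pkcs7_padded_length is a hypothetical helper added for illustration.

```python
BLOCK_SIZE = 16  # AES block size in bytes

# Sizes copied from the log output above.
logged_sizes = [
    489232, 498528, 494880, 500592, 496944, 497056,
    490688, 497328, 494416, 479872, 487984,
]


def pkcs7_padded_length(plain_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Length after PKCS7 padding: always the next multiple of block_size,
    with a full extra block added when the input is already aligned."""
    return (plain_len // block_size + 1) * block_size


# Every logged ciphertext length should be block-aligned.
all_block_aligned = all(size % BLOCK_SIZE == 0 for size in logged_sizes)
print(all_block_aligned)

# After a successful unpad, the plaintext would be 1..16 bytes shorter.
for size in logged_sizes[:3]:
    print(size, "-> unpadded length between", size - BLOCK_SIZE, "and", size - 1)
```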

JECSand commented 1 year ago

@ValueRaider @decodedcoder thanks for the info. I’ve been so slammed with work these last few days. I can look into this tomorrow or over the weekend.

ValueRaider commented 1 year ago

Yahoo has changed which 4 of the 10004 root.App.main keys they combine; it is no longer simply the last 4. https://github.com/ranaroussi/yfinance/issues/1407#issuecomment-1426819342 @Meborl

JECSand commented 1 year ago

@ValueRaider I’m working on this now. Trying to find patterns in the JS.

JECSand commented 1 year ago

@ValueRaider I've come to the conclusion that the only way to do this in a worthwhile manner is executing the JS code itself. I'm looking at js2py and PyMiniRacer.

Of the two, my preference would be to find a solution using js2py as in this guide: https://devpress.csdn.net/python/630502f87e6682346619d3dc.html

PyMiniRacer has a lot more overhead and doesn't seem as stable.

There's just no point spending hours rewriting their smoke and mirrors logic in Python, only for them to change a few mirrors around and break it.

JECSand commented 1 year ago

@ValueRaider @decodedcoder I'm able to get the data and am refactoring to work around the changes. Hoping to have something up before I get off.

decodedcoder commented 1 year ago

@JECSand - Please let us know when you have a new version ready to download. Appreciate the work and this library!

JECSand commented 1 year ago

@decodedcoder I'm very very close, testing the fix now. Will have a new version ready before I go to sleep tonight. The solution I found will actually make this library work much better, looking forward to releasing it.

JECSand commented 1 year ago

@decodedcoder Just released v1.13, which includes fixes for this issue. I went ahead and refactored YahooFinancials to consume from the Yahoo API instead of web scraping from the context store. The package runs faster now, and this approach seems like a more stable long-term solution. Much of the data that recently went missing is now available again (such as beta, edit, etc.); see #128.

I tried my hardest to ensure complete backwards compatibility; the only change is that some numeric values are now returned as floats instead of integers. If that becomes too much of an issue, feel free to reach out. Additionally, if any methods do not behave the way they did before, please open a bug issue and I'll address it within 48 hours. Bug fixes going forward should be much easier and quicker than this refactor. I appreciate everyone's patience!
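For callers that depended on integer values, one possible workaround is to normalize whole-number floats back to ints after fetching. This is only a sketch: as_int_if_whole and normalize_numbers are hypothetical helpers, not part of yahoofinancials, and the sample dict and its field names are made up for illustration rather than taken from real library output.

```python
def as_int_if_whole(value):
    """Return int(value) for floats with no fractional part, else value unchanged."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def normalize_numbers(obj):
    """Recursively apply as_int_if_whole to nested dicts and lists."""
    if isinstance(obj, dict):
        return {key: normalize_numbers(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [normalize_numbers(item) for item in obj]
    return as_int_if_whole(obj)


# Assumption: made-up sample data, shaped loosely like a nested result.
sample = {"TICKER": {"volume": 1234500.0, "beta": 1.25, "history": [10.0, 10.5, None]}}
print(normalize_numbers(sample))
```

Values that carry a real fractional part, as well as None, NaN, and infinities, pass through unchanged, so the helper only affects floats that represent whole numbers.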

pabgone commented 1 year ago

Outstanding work! Thank you