repollo closed 1 year ago
hmmmm I'm going to look at how it's sharing the state.
I see it happening with 2 separate LiveChatScraper objects, gonna investigate.
ahhh I see the issue
tee hee forgot how Python treats variables in classes
Alright I have a branch for this on movingSourceFilesToSrcForCleanup
Nice! Thanks!
Want me to close this or want to leave it open till PR?
sure you can close it. Thanks for the find!
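The "variables in classes" behavior mentioned above is presumably Python's sharing of class-level attributes across instances, which would explain why two separate `LiveChatScraper` objects saw each other's state. A minimal sketch, assuming the state lived in a mutable class attribute (the class names and the `messages` attribute are hypothetical, not taken from the library):

```python
class SharedStateScraper:
    # Hypothetical stand-in, not the library's code.
    # A mutable class attribute: one list object shared by every instance.
    messages = []

    def scrape(self, video_id):
        # self.messages resolves to the class-level list, so this mutates shared state.
        self.messages.append(video_id)
        return self.messages


class PerInstanceScraper:
    def __init__(self):
        # An instance attribute: each object gets its own list.
        self.messages = []

    def scrape(self, video_id):
        self.messages.append(video_id)
        return self.messages


a, b = SharedStateScraper(), SharedStateScraper()
a.scrape("video-1")
shared = b.scrape("video-2")    # b also sees what a scraped

c, d = PerInstanceScraper(), PerInstanceScraper()
c.scrape("video-1")
isolated = d.scrape("video-2")  # d only sees its own scrape
```

Here `shared` contains both video ids even though `a` and `b` are different objects, while `isolated` contains only `"video-2"`.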
Description:
The `LiveChatScraper` class retains state between successive scrapes, leading to unexpected behavior and potential data corruption. This state retention shows up especially when scraping multiple videos in succession.

Steps to Reproduce:
1. Create a `LiveChatScraper` object.
2. Call the `scrape` method on a YouTube video with a live chat.
3. Using the same `LiveChatScraper` object, call the `scrape` method on another video.

Expected Behavior:
Each call to the `scrape` method should behave as if it's the first time, with no retained state from previous calls.

Observed Behavior:
State is retained between calls to the `scrape` method, leading to potential errors and data corruption.

Workaround:
A manual call to the `reset` method of the `LiveChatScraper` object can be made between successive scrapes to clear the state. However, this is not intuitive and can easily be missed, leading to issues.

Proposed Solution:
1. Reset automatically in the `scrape` method: call `self.reset()` at the beginning of the `scrape` method, and reset again if it fails or raises an `Exception`.
2. Refactor the `LiveChatScraper` class to avoid global state, or ensure that a new instance is required for each scrape.

Additional Notes:
This issue was identified when scraping multiple videos in succession without manually resetting the state. A temporary fix involving a call to `reset` was implemented, but a more permanent solution in the library would be beneficial. The issue became more apparent when scraping concurrently.
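The first option in the proposed solution could look roughly like the sketch below. It is not the library's implementation: `ResettingScraper`, `_fetch`, and the `messages` and `video_id` attributes are hypothetical stand-ins, and the network call is replaced by a placeholder.

```python
class ResettingScraper:
    """Hypothetical stand-in for LiveChatScraper; attribute names are assumed."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Rebind per-instance state to fresh objects.
        self.messages = []
        self.video_id = None

    def _fetch(self, video_id):
        # Placeholder for the real live chat request; fails on an empty id.
        if not video_id:
            raise ValueError("video_id must not be empty")
        return [f"{video_id}: hello", f"{video_id}: world"]

    def scrape(self, video_id):
        self.reset()  # start every scrape from a clean slate
        try:
            self.video_id = video_id
            for message in self._fetch(video_id):
                self.messages.append(message)
        except Exception:
            self.reset()  # do not leave partial state behind on failure
            raise
        # Return a copy so callers cannot mutate the internal list.
        return list(self.messages)


scraper = ResettingScraper()
first = scraper.scrape("video-1")
second = scraper.scrape("video-2")  # contains only video-2 messages
```

With this shape, callers no longer need to remember the manual `reset` call between scrapes. It would probably not make a single instance safe to share across concurrent scrapes, though; for that case, one instance per scrape (the second option) still seems the safer route.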