jewbmx / jewbmx.github.io

JewRepo Source Link
https://jewbmx.github.io/

Scrubs Provider Settings: VidCloud, VidPlay, and UpCloud #23

Open · garycnew opened this issue 5 months ago

garycnew commented 5 months ago

Upon reviewing the Scrubs Provider Settings, I didn't see some of my favorite providers, like VidCloud, VidPlay, and UpCloud. Are they bundled in provider objects that I am unaware of? Are they excluded on purpose? If not... would it be possible to submit them as a feature request?

Thanks for a great add-on!

jewbmx commented 5 months ago

If they aren't seen, it's likely because they aren't used on the sites I scrape, or they are labeled that way there but are really rabbitstream, like a lot of the sites do lately. (Rabbit and doki are usually labeled VidCloud and UpCloud on a lot of them.)

garycnew commented 4 months ago

I believe I understand your provider metadata better now. The provider list within the Scrubs v2 configuration is a list of the sites you scrape. When a desired title is selected, Scrubs v2 then lists the stream quality/definition, the site it was scraped from, and the referenced file source (e.g., VidCloud, VidPlay, etc.) found on said site.

Prior to using Scrubs v2, I used fmovies and similar sites directly for several years, which include VidPlay sources. I noticed that there is a provider.fmovies object, so I assume it would include the VidPlay sources? Unfortunately, I don't see fmovies as an available option when selecting my desired title. Perhaps the titles I'm selecting don't have fmovies sources indexed?

Do you ever need assistance with scraping and indexing sites/sources for Scrubs v2? I specialize in site scraping using various methods (proxies, Tor, CAPTCHA circumvention, etc.) and have a production botfarm for such purposes. It might be interesting to apply such resources to a worthwhile cause.

Also, if not already an included provider, I'd like to recommend adding movies2watch.to to the Scrubs v2 site scrape list.

Thanks, again, for such a great add-on!

phill-nz commented 4 months ago

You should grab a fork and go for it. There are not enough stable addons left to endanger any with suss code, when they seem so active at the moment at getting them all non-functional.

garycnew commented 4 months ago

Kodi add-ons are not my forte, and Scrubs v2 is already great. I was simply thinking my bots might be able to help with further building out and maintaining the Scrubs v2 database. If interested... it would be helpful to know exactly what data is being scraped and the database table architecture.

jewbmx commented 4 months ago

Don't really know what either of you are saying, lol, but if you wanna help out or get into making scrapers, you should first open the ones in the addon and learn from them before attempting anything ;)

garycnew commented 4 months ago

@jewbmx Where does Scrubs v2 get installed in the Kodi directory structure? What are the file names of the scrapers in the add-on that you're referring to? Does Scrubs v2 scrape an indexed database of previously referenced provider links, or does it scrape the links in real-time? I'm starting to think it's the latter.

Edit: Is the scraper script you're referring to Kodi/addons/plugin.video.scrubsv2/resources/lib/modules/scrape_sources.py? Is this script run each time a movie or episode is selected in Scrubs v2? I can see pros and cons to implementing it this way. I haven't ever coded in Python, but I believe I can add movies2watch.to to the scrape_sources.py script and create a patch or merge file for it. How do you execute and test changes to the scrape_sources.py script? BTW... Love the comments in the code. At least there are comments. :-D Thanks!

garycnew commented 4 months ago

@jewbmx It appears that scrape_sources.py is a module used by the scrapers in Kodi/addons/plugin.video.scrubsv2/resources/lib/sources/working. I thought I might use fmovies_vision.py as an example script, but found that it was in the working/duds/ directory, which I assume means it broke at some point. Then I remembered that fmovies recently changed their domain name to fmoviesz.to and thought I might attempt updating the fmovies_vision.py script by copying it to the working directory and renaming it to fmoviesz_to.py. I made the relevant domain name changes from fmovies.vision to fmoviesz.to within the fmoviesz_to.py file and added a new entry to settings.xml. No success.
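
The entry I added followed the pattern of the existing scraper toggles; something along these lines (the exact id and label are my guesses based on the other entries):

```xml
<!-- hypothetical settings.xml entry; id and label guessed from the existing pattern -->
<setting id="provider.fmoviesz" type="bool" label="fmoviesz" default="true"/>
```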

I then compared the site structure of fmovies.vision to fmoviesz.to:

https://web.archive.org/web/20220522112605/https://fmovies.vision/movie/58-f9-fast-and-furious-9-the-fast-saga.html

https://fmoviesz.to/movie/f9-v9kp4

It appears that fmovies.vision and fmoviesz.to are different sites with fmovies.vision no longer in service.

That being said... I believe there are pros and cons to how Scrubs v2 is currently implemented.

The pros are that Scrubs v2 is decentralized and scrapes in real-time, which also leads to its cons.

The cons are that if any provider changes its domain name, structure, or site code, the scraper breaks until a fix is pushed out in a future version of Scrubs v2. Then there is the possibility of providers referencing redundant content, and the time it takes to scrape a provider in real-time. This approach seems very time-consuming and fragile.

Would it be more effective to have bots scrape these providers ahead of time, indexing, checking for redundancies, and then providing the normalized sources back to Scrubs as a single API? Sounds like a web-centric debrid service.
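
To sketch what I mean, a normalized response from such a backend might look something like this (purely hypothetical; the endpoint and field names are made up):

```python
# Hypothetical response from a pre-scraped, de-duplicated source index;
# every field name here is invented for illustration only.
response = {
    'imdb': 'tt8936646',
    'title': 'Extraction',
    'year': 2020,
    'sources': [
        {'label': 'VidCloud', 'host': 'rabbitstream.net', 'quality': '1080p',
         'url': 'https://rabbitstream.net/embed/...', 'last_checked': '2024-02-25'},
        {'label': 'MixDrop', 'host': 'mixdrop.co', 'quality': '720p',
         'url': 'https://mixdrop.co/e/...', 'last_checked': '2024-02-25'},
    ],
}
```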

Please know that even with Scrubs' shortcomings, it's still one of the BEST Kodi add-ons currently available for affordable content, and I'm extremely grateful for it. I'm simply wondering if there is a more effective approach. I'd be interested in assisting with a botfarm-based backend, but I'm not sure I'm up for developing and maintaining the frontend.

I noticed that the scrapers use the requests module for most of the heavy lifting. Which of the scrapers within Scrubs v2 would be a good, basic reference for writing the fmoviesz_to.py and movies2watch_to.py scripts? I believe they would be excellent additions.

garycnew commented 4 months ago

If they aren't seen, it's likely because they aren't used on the sites I scrape, or they are labeled that way there but are really rabbitstream, like a lot of the sites do lately. (Rabbit and doki are usually labeled VidCloud and UpCloud on a lot of them.)

@jewbmx I see what you mean... Using Web Developer Tools, I can watch the provider site's network traffic make a request to the embedded source, where DoodStream = dood.watch or dood.re, Filemoon = filemoon.sx, MixDrop = mixdrop.co, MyCloud = mcloud.bz, Streamtape = streamtape.com, Supervideo = supervideo.tv, VidCloud = rabbitstream.net, VidPlay = vidplay.online, UpCloud = rabbitstream.net, Upstream = upstream.to, Voe = voe.sx, etc.
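
In code terms, that label-to-host mapping looks roughly like this (hosts taken from the traffic observed above):

```python
# Provider labels as shown on the scraped sites, and the embed hosts
# their network requests actually resolve to.
EMBED_HOSTS = {
    'DoodStream': ['dood.watch', 'dood.re'],
    'Filemoon':   ['filemoon.sx'],
    'MixDrop':    ['mixdrop.co'],
    'MyCloud':    ['mcloud.bz'],
    'Streamtape': ['streamtape.com'],
    'Supervideo': ['supervideo.tv'],
    'VidCloud':   ['rabbitstream.net'],
    'VidPlay':    ['vidplay.online'],
    'UpCloud':    ['rabbitstream.net'],
    'Upstream':   ['upstream.to'],
    'Voe':        ['voe.sx'],
}
```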

jewbmx commented 4 months ago

Think the logic would be the sources module first, then the scrapers in the folders, then the scrape_sources module for scraping further. Final links sometimes need more work or offer more sources within their code, like various sources combined into one, which sometimes are not supported by resolveurl; that module is also used for sources resolveurl doesn't support, or didn't support at the time when I wrote it in, lol.

As for API-based scraping, that would probably be done like 2embed and some other sources, which aren't that great.

Making a cached type of source, like a source link collection, would work OK, but sources are fluid like the scraper sites, so they will likely change along the way and have a lot of file-not-found errors down the road.

What we really need is a debrid service that's free to use with a supportive pay option, so people can use the service for free or pay if they wanna. (Which personally I think it should be anyway, lol, but everyone is a money-hungry jew.)
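
Rough sketch of that flow in code terms (simplified stand-ins, not the actual code):

```python
# Simplified picture of the pipeline; every helper here is a stand-in.
import resolveurl  # the resolver module the addon leans on

def get_sources(scrapers, title, year):
    links = []
    for scraper in scrapers:                    # 1. per-site scrapers
        links += scraper.sources(title, year)
    links = dig_deeper(links)                   # 2. scrape_sources-style digging
    playable = []
    for link in links:
        if resolveurl.HostedMediaFile(link).valid_url():
            playable.append(link)               # 3. resolveurl supports the host
        else:
            playable += handle_manually(link)   # 4. hand-rolled handling
    return playable

def dig_deeper(links):      # stand-in for the scrape_sources module
    return links

def handle_manually(link):  # stand-in for host-specific workarounds
    return [link]
```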

eeden003 commented 4 months ago

Using a debrid service? Well, there are a lot of addons, even badly written addons, that work only because of debrid services, and that's a shame. I know of a repo with a lot of addons with debrid support, and they all work, but why oh why would you need 10 or more addons that only work thanks to these debrid services? Some addons have "nice" features like new movies, comedy, calendar, horror, etc., but at base they all do the same work thanks to scrapers and debrid services.

ScrubsV2 is a good addon that does not need all these features; it has a nicely made menu, it is easy to operate, and it looks for and finds free, working links. Yes, sites keep changing and moving, but if someone, or several people, could find a solution for this problem, then you would have a perfect addon with free links that work. The sites that ScrubsV2 scrapes are also sites that I use on my PC to visit and stream/download: free and working! There are a lot of sites that work on a PC; yes, there are ads/pop-ups and maybe other things, but they stream for free and also in HD. The problem is how to make an addon that can overcome all these pop-ups and unwanted messages, etc., without making use of a debrid service.

I know of only one other addon with free links, Free99, but that addon is not as good as ScrubsV2. I do know a lot of addons that are "good" only because of debrid services, but I do not like them much. I like ScrubsV2 much, much more, and it should be the number ONE addon that we use as the main addon. It is always possible to use a second addon with a debrid service (they are, IMO, "easier" to program), but the real challenge is to adapt ScrubsV2 in a way that makes people want to use ScrubsV2 as the number-one go-to addon, because it works great and with FREE links. There are nearly no addons with working free links; some say they have free links, but when testing them, no links are found, as is always the case. ScrubsV2 does find free, working links, so the addon needs to keep living and being used on a daily basis!

garycnew commented 4 months ago

Making a cached type of source, like a source link collection, would work OK, but sources are fluid like the scraper sites, so they will likely change along the way and have a lot of file-not-found errors down the road.

I can see your point. Sources can frequently transition to 404. However, I did notice that the archived sources for fmovies.vision are still 200, even though the fmovies.vision provider is no longer in service.

https://web.archive.org/web/20220522112605/https://fmovies.vision/movie/58-f9-fast-and-furious-9-the-fast-saga.html

I believe that having the actual source URLs would be valuable as a backup should a provider go offline. A list of backup sources would be easier to validate for availability/redundancy, manage, and rate.
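
A minimal sketch of the kind of validation pass I have in mind (plain requests; the cached list here is hypothetical):

```python
# Check cached source URLs for availability; the list itself is hypothetical.
import requests

cached_sources = [
    'https://web.archive.org/web/20220522112605/https://fmovies.vision/movie/58-f9-fast-and-furious-9-the-fast-saga.html',
]

for url in cached_sources:
    try:
        code = requests.head(url, allow_redirects=True, timeout=10).status_code
    except requests.RequestException:
        code = None
    print(code, url)  # keep the 200s, retire the 404s and timeouts
```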

What we really need is a debrid service that's free to use with a supportive pay option, so people can use the service for free or pay if they wanna. (Which personally I think it should be anyway, lol, but everyone is a money-hungry jew.)

I believe the main difference between Scrubs v2 and a debrid service is source reference vs source hosting, which transitions from a legally grey area to a black area. However, there are ways to address this issue as well.

I agree that Scrubs v2 is one of the best affordable media add-ons for Kodi (with Elementum Burst being a close competitor), but Scrubs v2 is quite fragile and requires a great deal of effort to maintain.

I believe Scrubs v2 already has all of the necessary tools to further improve its stability and value to the Kodi free add-on offerings.

@jewbmx Reviewing the Scrubs v2 code, I can see how much work you've devoted to the project. Great work!

garycnew commented 4 months ago

Think the logic would be the sources module first, then the scrapers in the folders, then the scrape_sources module for scraping further. Final links sometimes need more work or offer more sources within their code, like various sources combined into one, which sometimes are not supported by resolveurl; that module is also used for sources resolveurl doesn't support, or didn't support at the time when I wrote it in, lol.

@jewbmx I'm trying to get up to speed on programming in Python by creating a movies2watch_to.py. How do you typically set up your Python development environment? Do you use a Python virtual environment to develop Scrubs v2? Do you configure PYTHONPATH or sys.path to reference the necessary modules?

Kodi/addons/plugin.video.scrubsv2
Kodi/addons/script.module.requests
Kodi/addons/script.module.resolveurl
Kodi/addons/script.module.kodi-six

I'd like to be able to run $ python3 movies2watch_to.py on the command line and test the scrape output.
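
This is the kind of harness I have in mind; a minimal sketch, assuming the scrapers follow the Exodus-style convention of a source class with movie() and sources() methods, and that each script module keeps its importable code under a lib/ subdirectory (both are guesses on my part):

```python
# Hypothetical CLI harness for a Scrubs-style scraper; the paths and
# the source-class convention are assumptions, adjust to your install.
import sys

for addon in ('plugin.video.scrubsv2/resources/lib',
              'script.module.requests/lib',
              'script.module.resolveurl/lib',
              'script.module.kodi-six/lib'):
    sys.path.insert(0, 'Kodi/addons/' + addon)

from sources.working import movies2watch_to  # the new scraper

s = movies2watch_to.source()
url = s.movie(imdb='tt8936646', title='Extraction', localtitle='Extraction',
              aliases=[], year='2020')
print(s.sources(url, hostDict=[], hostprDict=[]))
```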

Any development environment pointers and testing would be greatly appreciated.

Thanks, again.

jewbmx commented 4 months ago

If I recall, debrid-style use is still legal-ish for the basic streaming-only process, since the files are all still off your device, just being used in a URL. Think there is a yourdebrid project on GitHub that's like an open-source debrid service or some shit, but it should be useful if you wanna learn a little in that area, although I think it was in one of them odd programming languages. As for Scrubs using debrid, I'd just have to add it back in; I removed all that junk long ago though, because I don't like the idea of supporting people's greed.

And when it comes to making scrapers, you can do it like me, which is having Kodi on the laptop or PC, then adding some shortcuts to the desktop for some of the Scrubs folders, like lib, so you can access the modules folder and scrapers quickly. Then you just toss all my made scrapers into a temp folder inside the scraper folder to disable them, and when you find a site you can scrape, you look through the pile to find a decent template that matches your needs, copy it to the main folder, then start swapping out the old bits for your new stuff and rename it. The settings don't matter in this way, and no setting coded in means it will be used, so no need to bother making the setting till the end, lol, especially since the temp folder will disable it for you if you move it there or to another one. When it comes to test runs, I do that all manually too, through the dev menu section I put in, with the test lists of some movies and TV shows.

Y'all are more than welcome to try improving anything about Scrubs and making new scrapers; you can toss it to me on here to add into the regular update as well, but I won't always add stuff in and might modify it to fit my OCD, lol. But I'd say a collaboration/group effort on one addon is a whole lot better of an idea compared to the usual approach of forking an addon and making yet another duplicate of shit like most folks do, lol.

garycnew commented 4 months ago

@jewbmx My Python environment is complaining about the 'ssl' and 'kodi_six' modules:

```
Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
```

and

File "Kodi/addons/plugin.video.scrubsv2/resources/lib/modules/control.py", line 9, in from kodi_six import xbmc, xbmcaddon, xbmcgui, xbmcplugin, xbmcvfs ModuleNotFoundError: No module named 'kodi_six'

Which I have in the PYTHONPATH: Kodi/addons/script.module.kodi-six
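
Two things I'm starting to suspect: the importable code may live under script.module.kodi-six/lib rather than the addon root, and even with the right path, kodi_six wraps the xbmc modules that only exist inside Kodi, so a command-line run may need stubs. A hypothetical workaround sketch:

```python
# Hypothetical stubs so addon code can be imported outside Kodi.
# Real attribute access on these modules will still fail; this only
# gets past the import in control.py.
import sys, types

kodi_six = types.ModuleType('kodi_six')
for name in ('xbmc', 'xbmcaddon', 'xbmcgui', 'xbmcplugin', 'xbmcvfs'):
    mod = types.ModuleType(name)
    setattr(kodi_six, name, mod)
    sys.modules[name] = mod
sys.modules['kodi_six'] = kodi_six

# Now "from kodi_six import xbmc, ..." resolves against the stubs.
```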

jewbmx commented 4 months ago

I wouldn't know anything about all that, lol. My Kodi install is a basic one from the Kodi site on a Windows laptop, and I only use Notepad++ for all my stuff, besides Firefox for a web browser, lol. I've always seen further Python installation types as a waste of space, since I can learn whatever I need through a couple of good Google searches, lol.

garycnew commented 4 months ago

I wouldn't know anything about all that, lol. My Kodi install is a basic one from the Kodi site on a Windows laptop, and I only use Notepad++ for all my stuff, besides Firefox for a web browser, lol. I've always seen further Python installation types as a waste of space, since I can learn whatever I need through a couple of good Google searches, lol.

@jewbmx My install is a basic one from the Kodi site too, but on an Apple Mac Mini and using vi as my editor. I have sorted out the 'ssl' error by swapping the latest urllib3 module for an older version that is compatible. I am left with the 'kodi_six' ModuleNotFoundError, which is strange, as it is installed and referenced within the PYTHONPATH.

I'm wondering if I need to un/reinstall the 'kodi_six' module to get it to work?

I'm curious... How do you have your PYTHONPATH and/or sys.path configured for your development environment?

Thanks!

garycnew commented 4 months ago

@jewbmx

I gave up on testing the movies2watch_to.py script on the command line in favor of the methodology you outlined: placing all the scripts into a temp directory, with only the movies2watch_to.py script remaining within the plugin.video.scrubsv2/resources/lib/sources/working directory.

However, it seems the original providers are cached within Kodi. When I attempt to search for a movie within Kodi > Scrubs v2, all the original providers are still executed and return results. I've even tried restarting Kodi, but the original providers for Scrubs v2 are still present. I was expecting only the single movies2watch_to.py provider within the working directory to be executed. Do you know why this is?

Also, will you elaborate on your statement regarding manual test runs:

When it comes to test runs, I do that all manually too, through the dev menu section I put in, with the test lists of some movies and TV shows.

Thanks, again, for your assistance.

Thanks!

garycnew commented 4 months ago

@jewbmx

Finally making some progress. I figured out that, in order to remove the cached providers, the temp directory must be located outside of the "sources" directory and all the *.pyc files within the "sources/working/__pycache__" directory must be deleted.
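
Clearing the compiled caches can be scripted; a small sketch (the path assumes a default install layout):

```python
# Remove compiled caches so deleted/moved scrapers stop loading.
import pathlib, shutil

working = pathlib.Path('Kodi/addons/plugin.video.scrubsv2/resources/lib/sources/working')
for cache in working.rglob('__pycache__'):
    shutil.rmtree(cache)
```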

I then found a suitable scraper template by grepping through all the working files, which I had placed in a temp directory called "working.backup", for a similar search_link:

```
$ grep -m 1 search_link ../../working.backup/*.py | grep "/search/" | grep -v "feed" | grep -v "html" | grep -v "movies2watch"
../../working.backup/123moviestv_me.py: self.search_link = '/search/%s'
../../working.backup/1movies_la.py: self.search_link = '/search/%s'
../../working.backup/cineb_net.py: self.search_link = '/search/%s'
../../working.backup/cinebox_cc.py: self.search_link = '/search/%s'
../../working.backup/dopebox_to.py: self.search_link = '/search/%s'
../../working.backup/himovies_top.py: self.search_link = '/search/%s'
../../working.backup/myflixer_it.py: self.search_link = '/search/%s'
../../working.backup/primewire_mx.py: self.search_link = '/search/%s'
../../working.backup/tinyzonetv_to.py: self.search_link = '/search/%s'
../../working.backup/watcha_movie.py: self.search_link = '/search/%s'
../../working.backup/winnoise_com.py: self.search_link = '/search/%s'
```

```
$ grep -m 1 search_link ../../working.backup/*.py | grep "/search/" | grep -v "feed" | grep -v "html" | grep -v "movies2watch" \
    | sed "s|:.*||g; s|.*backup/||g;" \
    | while IFS= read -r file; do cp -p ../../working.backup/$file . ; done
```

I found that winnoise.com was a very close match to movies2watch.to and decided to use winnoise_com.py as my scraper template.

After making the necessary changes to the new movies2watch_to.py script, I tested it within Kodi and found that it successfully scrapes the MixDrop, Upstream, and Voe sources using the movie title "Extraction" (2020). However, it does not scrape the VidCloud (rabbitstream.net) and UpCloud (rabbitstream.net) sources, but that seems to be a problem with the existing winnoise_com.py and all other similar scrapers as well.

Is there a known issue with the scrapers and rabbitstream.net sources?

How would you like me to submit the movies2watch_to.py script for review?

Thanks!

P.S. I noticed that the m4uhd_tv.py and m4ufree_com.py scrapers are not working; I often use them for old and international movies. I tried correcting the self.base_link for m4uhd_tv.py, which was configured as 'https://m4uhd.tv' and should be 'https://ww1.m4uhd.tv'. Other than that, the rest of the code looks good, including the self.search_link, self.ajax_link, and the parseDOM csrf-token (_token) and data (m4u) handling. I tried enabling the log_utils.log function, but it doesn't appear to create the scrubsv2.log file. Any pointers on debugging scrapers would be greatly appreciated.

UPDATE:

@jewbmx I figured out how to enable logging within Scrubs v2 > Add-on > Media Source > Configure > Dev'ish Settings, which produces Library/Logs/scrubsv2.log. I'm still not sure what enabling the "Dev Menu" option does, though.

I removed all scrapers except m4uhd_tv.py, and uncommented and added a few more log_utils.log calls to log the post_link, link, and sources values.

The following is what is being reported in the scrubsv2.log:

```
[2024-02-25 18:25:43] [Scrubs v2 - 5.1.39 - DEBUG]: Source Searching Info = [ movie_title: Extraction | localtitle: Extraction | year: 2020 | imdb: tt8936646 ]
[2024-02-25 18:25:44] [Scrubs v2 - 5.1.39 - DEBUG]: post_link: NoneType: None
[2024-02-25 18:25:45] [Scrubs v2 - 5.1.39 - DEBUG]: link: NoneType: None
[2024-02-25 18:25:45] [Scrubs v2 - 5.1.39 - DEBUG]: sources: NoneType: None
[2024-02-25 18:25:45] [Scrubs v2 - 5.1.39 - DEBUG]: post_link: NoneType: None
[2024-02-25 18:25:47] [Scrubs v2 - 5.1.39 - DEBUG]: link: NoneType: None
[2024-02-25 18:25:47] [Scrubs v2 - 5.1.39 - DEBUG]: sources: NoneType: None
[2024-02-25 18:25:47] [Scrubs v2 - 5.1.39 - DEBUG]: post_link: NoneType: None
[2024-02-25 18:25:47] [Scrubs v2 - 5.1.39 - DEBUG]: link: NoneType: None
[2024-02-25 18:25:47] [Scrubs v2 - 5.1.39 - DEBUG]: sources: NoneType: None
```

You can see that it successfully performs the TMDB lookup and finds 3 sources on the page, but all the post_link, link, and sources values are "None." Suggestions? Thanks.
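
One thing I notice: "NoneType: None" is exactly what Python's traceback.format_exc() returns when no exception is active, so these lines may be logging a (nonexistent) traceback rather than the variables themselves. A quick check outside Kodi (plain Python, nothing Scrubs-specific; the post_link value is hypothetical):

```python
# format_exc() with no active exception yields the "NoneType: None"
# string seen in scrubsv2.log, which suggests the uncommented log
# calls format a traceback instead of the value being inspected.
import traceback

post_link = 'https://ww1.m4uhd.tv/ajax'   # hypothetical value
print(traceback.format_exc())             # -> NoneType: None
print('post_link: %s' % post_link)        # -> post_link: https://ww1.m4uhd.tv/ajax
```

So it may be worth checking whether those log lines pass the variable itself or a traceback.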