CodyBerenson / PGMA-Modernized

An updated approach for Plex Gay Media Adult Agents for both Full Feature Films and Scenes
MIT License

Update image cropping for QC, WB, and Fagalicious #1

Closed CodyBerenson closed 4 years ago

CodyBerenson commented 4 years ago

@JPH71, now that Aiden has absolved himself of all responsibility for, and knowledge of, us (I'm tempted to @ him here, just to piss him off), we've had to establish our own repository for your amazing Agents.

While I check out options for cloud hosting the Thumbor instance, can you please update the Agents to use the http://34.67.235.246:8888/unsafe/ Thumbor instance, and perhaps include your .vbs as a backup?

CodyBerenson commented 4 years ago

@j-ktz, here's the new repository. You are customer 00000001.

JPH71 commented 4 years ago

Sweet I will start on that now... Go on... I dare you @ him


JPH71 commented 4 years ago

Hi Cody

I have made the changes to QueerClick, WayBig and Fagalicious. I tested QueerClick and WayBig and both work fine. With Fagalicious I am getting a 503 Service Unavailable error well before it gets to the changes I have made. Could you check that your current Fagalicious is working fine, then try the new one?

All three: If the height of the image is greater than the following ratios:

The agents will first try to use Thumbor. If that fails and the operating system is Windows, they will use the VBScript, which only runs on Windows. We will need to create an AppleScript for macOS and a script for Linux to cater for those two. If Thumbor fails and the system is non-Windows, the agent will use the original image size... Sorry @j-ktz, I think you have a Mac.
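
For anyone following along, the fallback order described above can be sketched roughly like this. It is only an illustration, not the code in the bundles: `choose_poster`, `thumbor_fetch` and `local_crop` are hypothetical names, the two callables stand in for the real Thumbor request and the VBScript call, and the sketch is written for Python 3 rather than the Python that Plex ships.

```python
import platform

# Thumbor instance named earlier in this thread.
THUMBOR_BASE = "http://34.67.235.246:8888/unsafe/"


def choose_poster(image_url, thumbor_fetch, local_crop, system=None):
    """Return (source, payload) using the fallback order described above.

    thumbor_fetch and local_crop are hypothetical callables that return the
    cropped image bytes, or None (or raise) when cropping fails.
    """
    system = system or platform.system()

    # 1. Thumbor is always tried first.
    try:
        data = thumbor_fetch(THUMBOR_BASE + image_url)
        if data:
            return ("thumbor", data)
    except Exception:
        pass  # e.g. the instance is offline; fall through to the backup

    # 2. The VBScript backup only runs on Windows.
    if system == "Windows":
        try:
            data = local_crop(image_url)
            if data:
                return ("vbscript", data)
        except Exception:
            pass

    # 3. Every other case keeps the original, uncropped image URL.
    return ("original", image_url)
```

Passing `system` explicitly makes it possible to exercise the macOS and Linux paths from a Windows machine.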

Here are the three bundles QueerClick.bundle.zip WayBig.bundle.zip Fagalicious.bundle.zip

Happy testing @CodyBerenson

CodyBerenson commented 4 years ago

just checked the old fagalicious script.. works fine. will test the rest tomorrow. xoxoxoxo

CodyBerenson commented 4 years ago

@JPH71 The new Fagalicious bundle worked fine.

I noticed that you took the requirement out of the preferences for a Windows user to point the agent at its local install of PMS. So, it doesn't matter where the PMS was installed?

CodyBerenson commented 4 years ago

all three updated agents seem to work just fine. Yay!

CodyBerenson commented 4 years ago

Ok, so I've updated the master code with the three agents, and rewritten the readme to be specific to Jason's agents and to remove traces of Aiden.

CodyBerenson commented 4 years ago

Finally, I've left a bread crumb to this new repository on the existing (and I believe unsupported) master site that is what folks get a hit on when they google Plex Agents: https://github.com/LGBT-PlexPlugins/plex-gay-metadata-agent/issues/61#issue-582474164

We've built it....lets see if they come.

JPH71 commented 4 years ago

Super brilliant!

Yes as the imagecropper vbscript file was resident in the same folder as the init.py file, I thought it was superfluous to make it a preference. I will look into scripting a MacOS version of it and a Linux one, so that users of those systems have also got the backup

Glad to know that fagalicious worked without problems.....

Do you have any idea why Aiden acted like someone had pissed in his morning tea that day?? It has really confounded me as he seemed so positive to be back....

Do you need the other bundles zipped up and sent - or have you already uploaded them?

Thanks for all the help - and lets hope more people join up!

Jason xxxx


CodyBerenson commented 4 years ago

Unless you've made changes to any of the other bundles, we should be good (if you recall, you zipped up the latest and greatest for j-ktz recently).

I'm convinced that GitHub came down hard on Aiden because of the violations that screencaps would have caused. I'm guessing that, given the nature of his work, keeping his GitHub presence is important to him...and he didn't need to be affiliated with any more strikes against him. I made sure that the images and the language in the readme are clean.

In a day or so I'll close this issue and open a new one for image cropping across other platforms. I'll have no way to test them, though.

JPH71 commented 4 years ago

Neither will I... @j-ktz has a Mac, methinks... I will find out how to write a script and he can test it for us


j-ktz commented 4 years ago

Hi friends! Yeah, I can totally test. Just let me know what I have to do. Sorry still a newb.

CodyBerenson commented 4 years ago

@jph71 every once in a blue moon I have a file that GEVI won't match.... interested in troubleshooting? I'm OK that the agent works almost always and we don't need perfection. Let me know your thoughts...

xoxo

JPH71 commented 4 years ago

Send it over... Only way to get it working properly is to hit it with a wobbled


CodyBerenson commented 4 years ago

Is that a Harry Potter reference?

LOL.

Here are three. Not sure if the word "The" is throwing it off (GEVI won't allow you to use the word "The" when manually searching)...

The Apprentice 1 and 2, both by Delta Productions
The Auto Files 2 by Rudebox Media
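
If the leading "The" does turn out to be the culprit (that is unconfirmed at this point), one possible workaround is to drop it from the title before searching. A minimal sketch, assuming a hypothetical helper name `search_title` and Python 3; this is not how the GEVI agent currently builds its search string:

```python
import re

# Matches a leading "The" only when it is a whole word followed by whitespace.
_LEADING_ARTICLE = re.compile(r"^the\s+", re.IGNORECASE)


def search_title(title):
    """Return the title with a leading 'The' removed, for search purposes only."""
    cleaned = title.strip()
    stripped = _LEADING_ARTICLE.sub("", cleaned)
    # Never return an empty string: fall back to the cleaned original.
    return stripped or cleaned
```

The stripped form would only be used for the lookup; the full title can still be written to the metadata.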

And NO Proof of the Pudding Images! LOL LOL LOL

JPH71 commented 4 years ago

No it ain't... I meant to spell wobbler rather than wobbled...

I need to look at GEVI again... It was written a long while back... See how it can be improved...

Have you come across any other blogs... Sites that can be scraped? This also applies to @J-ktz

Otherwise we are still breathing though working from home... With all this Corona virus malarkey...


CodyBerenson commented 4 years ago

Here's another, also beginning with "The" and containing a Number: The First Time 2 by XXX-Project

JPH71 commented 4 years ago

Cool... Will investigate tomorrow and make changes xx


CodyBerenson commented 4 years ago

NO ESC! :(

Here's another "the", minus a movie number.

(Arena Entertainment) - The Sauna (2003).mp4

JPH71 commented 4 years ago

I have had enough VBA coding... Time to switch to Python... Just after supper... Belgium went into lockdown at lunch time... Had to do a quick dash to the shop for vodka, cigs, Red Bull and Coca-Cola... Really got my priorities right... 😄


CodyBerenson commented 4 years ago

if you were a red neck american, it would have been toiletpaper as priority #1

JPH71 commented 4 years ago

Lol... Will gargle with it morning and evening... Hopefully put the virus to shame 😄

Medicinal purposes... Yeah right my little brown proverbial!


CodyBerenson commented 4 years ago

lordie. first i lost aiden (would we qualify that as a loss in fact?)...and now i've lost my most fragile and imperfect Jason. this is why we can't let the far east into ESC. unless they are going to buy me better vpn

JPH71 commented 4 years ago

I am still here cupcakes... Had a busy 3 days at work... But round the corner and ready to get back to GEVI... let me know of other sites to scrape ... Better send a call out to @j-ktz

We are in soft lock down still here in Belgium... Waiting for the storm to hit...


JPH71 commented 4 years ago

😘😘😘😘😘😘


CodyBerenson commented 4 years ago

FYI, my little unicorn, Aiden's Thumbor instance (https://cdn.vigue.me/unsafe/0x0:800x1200/url.jpg) is back online. Not suggesting we change a damn thing...just wanted you to know.

j-ktz commented 4 years ago

Hey everyone! I'm alive! Work has just been crazy this week, haven't had much free time. It's strange, now my agents are scraping ads and using those as thumbnails. Was that a bug that was fixed in a newer release of the agents? Maybe I'm using an old one, but every time I replace them I have issues. It was the latest batch from the old thread. I'd post a screenshot but don't want to get anyone in trouble?

CodyBerenson commented 4 years ago

Hi j-ktz. The latest agents are in the code tab for this repository.... Once or twice, a week or two ago, I had an ad from WayBig show up as the poster. I ended up finding the same video on either Fagalicious or QueerClick, changed the name, rescanned the library, and it grabbed the correct poster from the different blog.

A quick fix when an agent doesn't grab a poster:

  1. in notepad or something similar, copy the link two messages above this
  2. copy the "image location" from the blog's poster
  3. replace url.jpg from #1 by pasting what you copied in #2
  4. copy the entire string
  5. use this as the URL for the poster. It should bring in a cropped poster from the blog directly into plex.
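
The same manual steps can be sketched in a few lines of Python. This is only an illustration of the copy-and-paste recipe above, using the Thumbor link quoted earlier in the thread; `cropped_poster_url` is a made-up name and the crop box `0x0:800x1200` is simply the one in that link.

```python
# The link from two messages above, with url.jpg as the placeholder.
THUMBOR_TEMPLATE = "https://cdn.vigue.me/unsafe/0x0:800x1200/url.jpg"
PLACEHOLDER = "url.jpg"


def cropped_poster_url(image_location, template=THUMBOR_TEMPLATE):
    """Swap the url.jpg placeholder for the blog poster's image location."""
    if not template.endswith(PLACEHOLDER):
        raise ValueError("template must end with the url.jpg placeholder")
    # Keep everything before the placeholder, then append the real image URL.
    return template[: -len(PLACEHOLDER)] + image_location.strip()
```

The string it returns is what gets pasted into Plex as the poster URL in step 5.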

JPH71 commented 4 years ago

Which of the agents did this... I noticed waybig was doing this but I corrected it. To find out which one did.. look at the tagline field of the movie.. click on the pen symbol and look at the tagline I think.. it will have the website on it

👍


JPH71 commented 4 years ago

Bugger that...😆

On Sat, 21 Mar 2020, 15:00 CodyBerenson, notifications@github.com wrote:

FYI, my little unicorn, Aiden's thumbor instance ( https://cdn.vigue.me/unsafe/0x0:800x1200/url.jpg) is back online. Not suggest we change a damn thing...just wanted you to know.


JPH71 commented 4 years ago

Hi Aiden,

Nice to hear from you.. I had really wondered if we members of the collective had done owt to offend you ..

Presently, I am in my second week of lockdown here in Belgium and it keeps being tightened..

At the flat I rent in residence, we have a shared connection and now we are having issues with bandwidth, as people have not been able to travel back home at the weekend...

I have just seen your message, when I woke up...

The Google idea is brilliant as it keeps the search in one place. Have you written the code for this or was it an idea?

Keep in touch

Jason

On Sat, 21 Mar 2020, 22:57 Aiden Vigue, notifications@github.com wrote:

Extremely sorry guys, I was in (unrelated) legal issues, and my lawyer thought it best to make the repo private. Except, I clicked delete. :( Thumbor & Facebox will be up permanently; I simply forgot to pay the measly $0.03 charge from Google. Also, would anyone be interested in using Google to search instead of the WayBig / Fagalicious / QueerClick site search? For example, instead of searching one at a time, you would query Google for

site:waybig.com OR site:fagalicious.com OR site:queerclick.com YOUR QUERY

I have already set up a captcha solving service, so if Google blocks the search engine result page scraping, the script will send the captcha to 2captcha to be solved. On my library of (now) ~1200 videos, only 10 were left without posters etc. And, best of all, the files need not be named anything special beyond the (Studio) - Name (Year).ext format.
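
As a rough illustration of the combined query quoted above (not the PHP implementation itself), the `site:` restriction can be assembled like this. `combined_query` and `encoded_query` are hypothetical names, and the default site list is just the three blogs named in the message.

```python
from urllib.parse import quote_plus

# The three blogs named in the message above.
DEFAULT_SITES = ("waybig.com", "fagalicious.com", "queerclick.com")


def combined_query(title, sites=DEFAULT_SITES):
    """Build one search string that covers every site at once."""
    restriction = " OR ".join("site:" + site for site in sites)
    return restriction + " " + title


def encoded_query(title, sites=DEFAULT_SITES):
    """URL-encode the combined query so it can be used as a q= parameter."""
    return quote_plus(combined_query(title, sites))
```

One request then covers all three blogs, which is the point of the suggestion: the first usable result wins, whichever site it comes from.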

All of these will match:
(Sean Cody) - Kieran & Asher (2018).mp4
(Sean Cody) - Kieran and Asher (2018).mp4
(Sean Cody) - Asher and Kieran (2018).mp4
(Sean Cody) - Kieran & Asher (2018).mp4
(Sean Cody) - Kieran Fucks Asher Bareback (2018).mp4
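
For reference, the (Studio) - Name (Year).ext naming format can be split into its parts with a single regular expression. A minimal sketch, assuming Python 3; `parse_filename` and the group names are made up for illustration and are not taken from any of the agents.

```python
import re

# (Studio) - Name (Year).ext
FILENAME_RE = re.compile(
    r"^\((?P<studio>[^)]+)\)\s*-\s*(?P<title>.+?)\s*\((?P<year>\d{4})\)\.(?P<ext>\w+)$"
)


def parse_filename(name):
    """Return a dict with studio, title, year and ext, or None if the name does not fit."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None
```

Only the pieces are extracted here; deciding that "Kieran & Asher" and "Asher and Kieran" are the same scene is left to the search step.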

Also, cast members & synopsis are found by Googling site:iafd inurl:title.rme QUERY and fetching from there. Alternate sources such as waybig description or GEVI are planned.

The agent is written in PHP and needs a web host to work. Also, a PC / server is needed. But, on the plus side, no file renames are needed ever, and metadata updates within 1 minute. I also would like to say sorry AGAIN for deleting it without notice. I was going through manic depression and as such made very bad choices that involved the law. GitHub was not involved in the disappearance of this repository.


ghost commented 4 years ago

Code is written.

ghost commented 4 years ago

I have only used 6 of the 2000 captcha credits, so I would be happy to send you the URL and code.

JPH71 commented 4 years ago

Perfect... That would be brilliant. When you get time, could you look into the cookie issue I had with GayVODclub... I will send the code again.

I do hope you have managed to sort out the issues you mentioned and are less stressed

Jason


ghost commented 4 years ago

I have. Fixing some last minute parsing issues. Should be done soon.

Also, the only studio names that matter (and DVDs should work, I haven't put in DVD specific scrapers yet, working on that after I post the code) are

Scenes and DVDs by Helix should have the studio name of Helix Studios in the filename
Scenes by Staxus should be Staxus
Scenes by 8TeenBoy should be 8TeenBoy
Scenes by CockyBoys should be CockyBoys.

Other than that, they do not matter. Also, year does not matter, and title does not matter (as long as it is basically original title from studio site) AKA it doesn't have to match with waybig title or fagalicious title.
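
To make the studio-name rule concrete, here is a small sketch of the lookup it implies. The domain values are copied from the `googleSearch(...)` calls in the PHP script posted in this thread; the dictionary, the function name `sites_for` and the use of Python are illustrative assumptions only.

```python
# Studio names that select a dedicated scraper, with the site each one searches.
STUDIO_SITES = {
    "Helix Studios": ["helixstudios.net"],
    "Staxus": ["staxus.com"],
    "8TeenBoy": ["8teenboy.com"],
    "CockyBoys": ["cockyboys.com"],
}

# Any other studio name falls through to the combined blog search.
COMBINED_SITES = ["waybig.com", "fagalicious.com", "bananaguide.com"]


def sites_for(studio):
    """Return the sites to search for a studio name taken from the filename."""
    # Exact match, mirroring the == comparisons in the PHP script.
    return STUDIO_SITES.get(studio, COMBINED_SITES)
```

Because the match is exact, a filename tagged "Helix" rather than "Helix Studios" would end up in the combined search.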

ghost commented 4 years ago

plexmagic.zip

ghost commented 4 years ago

PHP script code for who is interested:

<?php
require("../vendor/autoload.php");
if(isset($_GET['t'])) {
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
}
use Symfony\Component\HttpClient\HttpClient;
use Intervention\Image\ImageManager;
header("Access-Control-Allow-Origin: *");
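// Read the request parameters, defaulting to empty strings when absent
$query = isset($_GET['q']) ? $_GET['q'] : "";
$studio = isset($_GET['studio']) ? $_GET['studio'] : "";

if($studio == "") {
    $response = array("error" => 1);
    echo json_encode($response);
    // Stop here so no scraper runs without a studio name
    die();
}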

$metadata = array("cast"=>array(),"images"=>array(), "bg"=>array(), "genres"=>array());

// Serve a cached result when one exists; check and read the same file
$cacheFile = "saved/" . urlencode(str_replace("&","and",$query)) . ".json";
if(!isset($_GET['t']) && file_exists($cacheFile)) {
    header("Content-Type: application/json");
    echo file_get_contents($cacheFile);
    die();
}

//STAXUS
if($studio == "Staxus") {
    $client = new \Goutte\Client();
    $urls = googleSearch("site:staxus.com+$query");
    $url = $urls[0];
    $crawler = $client->request('GET', $url);

    //Video Title
    $metadata["title"] = $crawler->filter("div[class='video-descr__title'] > div > div")->first()->text();
    //Video Description
    $metadata["summary"] = stripDiac($crawler->filter("div[class='video-descr__content']")->text());
    //Release Date
    $date = DateTime::createFromFormat('d/F/Y', explode(": ", trim($crawler->filter("div[class='video-details']")->first()->text()))[1]);
    $metadata["released"] = $date->format("d-m-Y");

    //Rating
    $metadata["rating"] = floatval(trim($crawler->filter("span[class='video-grade-average'] > strong")->text())) * 20;

    //Cast
    $crawler->filter("div[class='video-descr__model-item']")->each(function($node) {
        global $metadata;
        $url = $node->filter("div[class='thumb']")->attr("style");
        $url = explode("url('", $url)[1];
        $url = explode("'", $url)[0];
        $url = "https:" . $url;
        $name = $node->filter("p > a")->text();
        $member = array("image"=>cropHead($url), "name"=>$name, "role"=>"");
        array_push($metadata["cast"], $member);
    });
    //Images
    $crawler->filter("div[class='gallery-image col-md-4 col-sm-6 aspect-ratio'] > a")->each(function($node) {
        global $metadata;
        $url = $node->attr("style");
        $url = explode("url('", $url)[1];
        $url = explode("'", $url)[0];
        $url = str_replace("thumbs","1024watermarked", $url);
        $url = "https:" . $url;
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $url);
    });
}

//8TeenBoy
else if($studio == "8TeenBoy") {
    $client = new \Goutte\Client();
    $url = "";
    $results = googleSearch("site:8teenboy.com+$query");
    foreach($results as $result) {
        if($url == "") {
            if(strpos($result, "video/") !== false) {
                $url = $result;
            }
        }
    }
    //echo $url;
    $crawler = $client->request('GET', $url);

    //Video Title
    $metadata["title"] = $crawler->filter("h2.pull-left")->first()->text();
    $metadata["title"] = stripDiac($metadata["title"]);
    //Video Description
    $metadata["summary"] = $crawler->filter("p[class='scene-description hide show-md']")->text();
    $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
    //Release Date
    $metadata["released"] = "";
    $iafd = $client->request('GET', 'http://www.iafd.com/results.asp?searchtype=comprehensive&searchstring=' . urlencode($metadata["title"]));
    $iafd->filter("#titleresult > tbody > tr")->each(function($node) {
        global $query, $metadata, $client;
        $title = $node->filter("td")->eq(3)->text();
        if($title == "") {
            $title = $node->filter("td")->eq(0)->text();
        }
        $distro = $node->filter("td")->eq(2)->text();
        // Replace only the whole word "and", not "and" inside names
        $metadata["title"] = preg_replace('/\band\b/', '&', $metadata["title"]);
        if(strtolower($title) == strtolower($metadata["title"]) && $distro == "helixstudios.net") {
            $href = "http://www.iafd.com/" . $node->filter(".pop-execute")->first()->attr("href");
            $iafd = $client->request('GET', $href);
            try {
                $releaseDate = $iafd->filter(".biodata")->eq(8)->text();
                if($releaseDate != "No Data") {
                    $date = DateTime::createFromFormat('M d, Y', $releaseDate);
                    $metadata["released"] = $date->format("d-m-Y");
                }
            } catch(Exception $e) {

            }
            $metadata["title"] = $iafd->filter("h1")->first()->text();

            $nodes = $iafd->filter(".castbox");
            $nodes->each(function($node) {
                //echo "node";
                global $metadata;
                $name = trim($node->filter("a")->text());
                $src = $node->filter("img")->attr("src");
                $role = $node->text();
                $role = str_replace($name, "", $role);
                $role = trim($role);
                $role = urlencode($role);
                $role = str_replace("%C2%A0", "", $role);
                $member = array("name" => $name, "image" => $src, "role" => $role);
                array_push($metadata["cast"], $member);
            });
            return;
        }
    });
    //Rating
    $metadata["rating"] = 100;
    //Cast
    /*
    $crawler->filter("div[class='thumbnail-grid thumbnail-grid pure-g']")->first()->filter("div > div > a")->each(function($node) {
        global $metadata;
        $url = $node->filter("div[class='thumbnail'] > img")->attr("src");
        $url = str_replace("img/200w/","",$url);
        $url = str_replace("https", "http", $url);
        $url = cropHead($url);
        $name = $node->filter("div[class='thumbnail-bottom-text']")->text();
        $member = array("image"=>$url, "name"=>$name);
        array_push($metadata["cast"], $member);
    });
    */
    //Images
    $crawler->filter("img[src*='https://cdn.8teenboy.com/img/250h/media/stills/']")->each(function($node) {
        global $metadata;
        $url = $node->attr("src");
        $url = str_replace("img/250h/", "img/1920w/", $url);
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $url);
    });
}

//Helix Studios
else if($studio == "Helix Studios") {
    $client = new \Goutte\Client();
    // Replace only the whole word "and", not "and" inside names
    $query = preg_replace('/\band\b/', '&', $query);
    $url = "";
    $results = googleSearch("site:helixstudios.net+$query");
    foreach($results as $result) {
        if($url == "") {
            if(strpos($result, "video") !== false || strpos($result, "movie") !== false) {
                $url = $result;
            }
        }
    }
    if(isset($_GET['url'])) {
        $url = $_GET['url'];
    }
    //echo $url;
    $crawler = $client->request('GET', $url);
    if(strpos($url, "HXM") !== false) {
        //DVD Title
        $metadata["title"] = $crawler->filter("div.boxContent > h3")->first()->text();
        //DVD Description
        $metadata["summary"] = $crawler->filter("p[class='description']")->text();
        $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
        //Release Date
        $date = $crawler->filter("div.boxContent > div")->first()->text();
        $date = trim(explode(": ", $date)[1]);
        $date = DateTime::createFromFormat('F j, Y', $date);
        $metadata["released"] = $date->format("d-m-Y");

        //Rating
        $metadata["rating"] = 100;
        //Cast
        $crawler->filter("#scene-models > li")->each(function($node) {
            global $metadata;
            $url = $node->filter("a > img")->attr("src");
            $url = str_replace("/img/150w","",$url);
            $url = cropHead($url);
            $name = $node->filter("a > div")->text();
            $member = array("image"=>$url, "name"=>$name, "role"=>"");
            array_push($metadata["cast"], $member);
        });
        //Images
        $id = explode("/", explode("movie/", $url)[1])[0];
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/https://cdn.helixstudios.com/media/covers/".$id."_back_xlarge.1539280596.jpg");
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/https://cdn.helixstudios.com/media/covers/".$id."_front_xlarge.1539280596.jpg");
    } else {
        //echo $url;
        //Video Title
        $metadata["title"] = $crawler->filter(".scene-title")->text();
        //Video Description
        $metadata["summary"] = $crawler->filter("tr")->eq(1)->text();
        $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
        //Release Date
        $html = $client->getResponse()->getContent();
        $html = explode("</td>", explode("Released: </span>", $html)[1])[0];
        $date = trim($html);
        $date = DateTime::createFromFormat('F j, Y', $date);
        $metadata["released"] = $date->format("d-m-Y");
        //Rating
        $metadata["rating"] = 100;
        //Cast
        $crawler->filter("tr")->first()->filter("a")->each(function($node) {
            global $metadata, $client;
            $url = $node->attr("href");
            $actorpage = $client->request('GET', $url);
            $url = $actorpage->filter("#modelHeadshot > img")->attr("src");
            $url = str_replace("img/320w/","",$url);
            $url = cropHead($url);
            $name = $node->text();
            $member = array("image"=>$url, "name"=>$name, "role"=>"");
            array_push($metadata["cast"], $member);
        });
        //Images
        $i = 0;
        $z = 0;
        $crawler->filter("img[src*='https://cdn.helixstudios.com/img/300h/media/stills/']")->each(function($node) {
            global $metadata, $i, $z;
            $url = $node->attr("src");
            $url = str_replace("img/300h", "img/1920w", $url);
            if($i < 10) {
                if($z % 2 != 1) {
                    array_push($metadata["images"],"https://cdn.vigue.me/unsafe/" .  $url);
                }
            }
            $i++;
            $z++;
        });
        $i = 0;
        $z = 0;
        $crawler->filter("img[src*='https://cdn.helixstudios.com/img/300h/members/stills/']")->each(function($node) {
            global $metadata, $i, $z;
            $url = $node->attr("src");
            $url = str_replace("img/300h", "img/1920w", $url);
            if($i < 10) {
                if($z % 2 != 1) {
                    array_push($metadata["images"],"https://cdn.vigue.me/unsafe/" .  $url);
                }
            }
            $i++;
            $z++;
        });
    }
    //file_put_contents("saved/" . urlencode($query) . ".json", json_encode($metadata));
}

//CockyBoys
else if($studio == "CockyBoys") {
    $client = new \Goutte\Client();
    $results = googleSearch("site:cockyboys.com+$query");
    $url = $results[0];
    $crawler = $client->request('GET', $url);

    //DVD Title
    $metadata["title"] = $crawler->filter("h1.sectionTitle")->first()->text();
    //DVD Description
    $metadata["summary"] = $crawler->filter(".movieDesc")->text();
    $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
    //Release Date
    $date = $crawler->filter(".underPlayer > div")->eq(1)->filter("p > span")->first()->text();
    $date = trim(explode(": ", $date)[1]);
    $date = DateTime::createFromFormat('d/m/Y', $date);
    $metadata["released"] = $date->format("d-m-Y");

    //Rating
    $metadata["rating"] = floatval($crawler->filter(".underPlayer > div")->first()->filter("p")->text()) * 10;
    //Cast
    $crawler->filter(".movieModels > span")->each(function($node) {
        global $metadata;
        $url = $node->filter("a")->last()->filter("img")->attr("src");
        $url = preg_replace('/([^:])(\/{2,})/', '$1/', $url);
        $url = cropHead($url);
        $name = $node->filter("a")->first()->text();
        $member = array("image"=>$url, "name"=>$name, "role"=>"");
        array_push($metadata["cast"], $member);
    });
    $url = str_replace("?type=vids","?type=highres",$url);
    $crawler = $client->request('GET', $url);
    $crawler->filter(".thumbs")->each(function($node) {
        global $metadata;
        $url = $node->attr("src");
        $url = str_replace("thumbs","1024watermarked",$url);
        $url = preg_replace('/([^:])(\/{2,})/', '$1/', $url);
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $url);
    });
}

//Combined Search
else {
    $client = new \Goutte\Client();
    $query = str_replace("and","&",$query);
    $results = googleSearch("site:waybig.com+OR+site:fagalicious.com+OR+site:bananaguide.com+$query");

    // Take the first result that is not a login redirect or a WayBig index/tag page.
    $skip = array("ServiceLogin", "goToSite", "ftp.waybig.com", "https://www.waybig.com/gallery/", "https://www.waybig.com/blog/tag/", "https://www.waybig.com/tag/", "https://www.waybig.com/video/", "https://www.waybig.com/pornstars/");
    $urlG = "";
    foreach($results as $result) {
        $blocked = false;
        foreach($skip as $fragment) {
            if(strpos($result, $fragment) !== false) {
                $blocked = true;
                break;
            }
        }
        if(!$blocked) {
            $urlG = $result;
            break;
        }
    }

    $crawler = $client->request('GET', $urlG);
    $metadata["title"] = $query;
    if(strpos($urlG, "waybig") !== false) {
        //Waybig
        //echo "waybig";
        $slug = explode("/blog/", $urlG)[1];
        $year = explode("/",$slug)[0];
        $month = explode("/",$slug)[1];
        $day = explode("/",$slug)[2];
        $metadata["released"] = $day . "-" . $month . "-" . $year;

        //Poster
        $poster = $crawler->filter("img[src*='zing.waybig.com']")->first()->attr("src");
        $manager = new ImageManager(array('driver' => 'gd'));
        $url = $poster;
        $arrContextOptions=array(
            "ssl"=>array(
                "verify_peer"=>false,
                "verify_peer_name"=>false,
            ),
        );  
        $response = file_get_contents($url, false, stream_context_create($arrContextOptions));
        $filename = "zing-" . uniqid() . ".jpg";
        file_put_contents($filename, $response);
        $image = $manager->make(imagecreatefromjpeg($filename));
        $width = $image->width();
        // Crop to a 2:3 poster; round so the crop coordinate in the URL stays an integer.
        $height = (int) round($width * 1.5);
        $poster = "https://cdn.vigue.me/unsafe/0x0:" . $width . "x" . $height . "/" . $poster;
        array_push($metadata["images"], $poster);
        unlink($filename);

        //IAFD infos
        $query = urlencode($query);
        $url = "https://google.com/search?q=site:iafd.com+$query+inurl:title.rme";
        $goog = $client->request("GET", $url);
        $url = $goog->filter("a[href*='/url']")->first()->attr("href");
        $url = explode("/url?q=", $url)[1];
        $url = explode("&", $url)[0];
        $url = urldecode($url);

        $iafd = $client->request("GET", $url);
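        try {
            $releaseDate = $iafd->filter(".biodata")->eq(8)->text();
            if($releaseDate != "No Data") {
                $date = DateTime::createFromFormat('M d, Y', $releaseDate);
                // createFromFormat() returns false when the text does not match the format.
                if($date !== false) {
                    $metadata["released"] = $date->format("d-m-Y");
                }
            }
        } catch(Exception $e) {
            // No release-date field on the IAFD page: keep the date parsed from the blog URL.
        }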
        $synopsis = trim($iafd->filter("#synopsis > .padded-panel")->first()->text());
        $metadata["summary"] = stripDiac($synopsis);

        //Cast
        $nodes = $iafd->filter(".castbox");
        $nodes->each(function($node) {
            //echo "node";
            global $metadata;
            $name = trim($node->filter("a")->text());
            $src = $node->filter("img")->attr("src");
            $role = $node->text();
            $role = str_replace($name, "", $role);
            $role = trim($role);
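            if(strpos($role, "Credited") !== false) {
                $role = explode(")", $role)[1];
                // Trim again, as the bananaguide branch does, so no stray space gets encoded.
                $role = trim($role);
            }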
            $role = urlencode($role);
            $role = str_replace("%C2%A0", "", $role);
            $member = array("name" => $name, "image" => $src, "role" => $role);
            array_push($metadata["cast"], $member);
        });
    } else if(strpos($urlG, "bananaguide") !== false) {
        //Posters
        $nodes = $crawler->filter("a[rel='gallery-image']");
        $nodes->each(function($node) {
            global $metadata;
            $url = $node->attr("href");
            $url = "https://bananaguide.com" . $url;
            array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" .  $url);
        });

        //IAFD infos
        $query = urlencode($query);
        $url = "https://google.com/search?q=site:iafd.com+$query+inurl:title.rme";
        $goog = $client->request("GET", $url);
        $url = $goog->filter("a[href*='/url']")->first()->attr("href");
        $url = explode("/url?q=", $url)[1];
        $url = explode("&", $url)[0];
        $url = urldecode($url);

        $iafd = $client->request("GET", $url);
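        try {
            $releaseDate = $iafd->filter(".biodata")->eq(8)->text();
            if($releaseDate != "No Data") {
                $date = DateTime::createFromFormat('M d, Y', $releaseDate);
                // createFromFormat() returns false when the text does not match the format.
                if($date !== false) {
                    $metadata["released"] = $date->format("d-m-Y");
                }
            }
        } catch(Exception $e) {
            // No release-date field on the IAFD page: leave the release date unset.
        }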
        $synopsis = trim($iafd->filter("#synopsis > .padded-panel")->first()->text());
        $metadata["summary"] = stripDiac($synopsis);

        //Cast
        $nodes = $iafd->filter(".castbox");
        $nodes->each(function($node) {
            //echo "node";
            global $metadata;
            $name = trim($node->filter("a")->text());
            $src = $node->filter("img")->attr("src");
            $role = $node->text();
            $role = str_replace($name, "", $role);
            $role = trim($role);
            if(strpos($role, "Credited") !== false) {
                $role = explode(")", $role)[1];
                $role = trim($role);
            }
            $role = urlencode($role);
            $role = str_replace("%C2%A0", "", $role);
            $member = array("name" => $name, "image" => $src, "role" => $role);
            array_push($metadata["cast"], $member);
        });
    } else{
        //Site Title
        $sTitle = stripDiac($crawler->filter(".entry-title")->first()->text());
        //Release Date
        $date_raw = trim($crawler->filter(".meta-date")->text());
        $date = DateTime::createFromFormat('F j, Y', $date_raw);
        $metadata["released"] = $date->format("d-m-Y");

        //Cast
        $tags = $crawler->filter(".post-meta > a[href*='/tag/']");
        $tags->each(function($node) {
            global $sTitle, $metadata;
            $tag = trim(stripDiac($node->text()));
            if(strpos(strtolower($sTitle), strtolower($tag)) !== false && strpos($tag, " ") !== false) {
                //echo $tag;
                //tag in title, assume cast member?

                $img = getIAFDActorImage($tag);
                if($img != "") {
                    $member = array("name"=>$tag, "image"=>$img);
                    array_push($metadata["cast"], $member);
                }

            } else {
                //genre!
                array_push($metadata["genres"], $tag);
            }
        });

        //Poster
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $crawler->filter(".mypicsgallery")->first()->filter("a > img")->first()->attr("data-src"));

        //Summary
        $summaryNodes = $crawler->filter(".entry-content > p");
        $metadata["summary"] = "";
        $summaryNodes->each(function($node) {
            global $metadata;
            $metadata["summary"] .= $node->text();
        });
        $metadata["summary"] = stripDiac($metadata["summary"]);
    }
    //file_put_contents("saved/" . urlencode($query) . ".json", json_encode($metadata));
}

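if($metadata["title"] != "") {
    // Cache the result; "&" is stored as "and" in the file name.
    file_put_contents("saved/" . urlencode(str_replace("&","and",$_GET['q'])) . ".json", json_encode($metadata));
    // t=1 is the debug switch: echo the JSON with backslash escapes stripped for readability.
    if(isset($_GET['t']) && $_GET['t'] == 1) {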
        echo str_replace("\\","",json_encode($metadata));
    } else {
        header("Content-Type: application/json");
        echo json_encode($metadata);
    }
}

// Returns $b - $s, clamped so the result is never negative.
function minimum($b,$s) {
    return max(0, $b - $s);
}
function getIAFDActorImage($name) {
    $name = strtolower($name);
    sleep(2);
    $q = "iafd " . $name . " inurl:person.rme";
    $url = "https://google.com/search?q=" . urlencode($q);
    $client = new \Goutte\Client();
    $goog = $client->request("GET", $url);
    $url = $goog->filter("a[href*='/url']")->first()->attr("href");
    $url = explode("/url?q=", $url)[1];
    $url = explode("&", $url)[0];
    $url = urldecode($url);
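    $final = "";
    // The original looped over an unused $genders array and fetched the same URL twice;
    // a single request gives the same result.
    try {
        $iafd = $client->request("GET", $url);
        $img = $iafd->filter("#headshot > img")->first()->attr("src");
        // The nophoto placeholder means there is no real headshot to use.
        if(strpos($img, "nophoto340.jpg") === false) {
            $final = $img;
        }
    } catch(Exception $e) {
        // Page missing or no headshot element: return an empty result.
        $final = "";
    }
    return $final;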
}
function cropHead($url, $padding = 50) {
    $urls = 'https://neural.vigue.me/facebox/check';
    $data = array('url' => $url, 'faceprint' => 'false');

    // Setup cURL
    $ch = curl_init($urls);
    curl_setopt_array($ch, array(
        CURLOPT_POST => TRUE,
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_HTTPHEADER => array(
            'Content-Type: application/json',
            'Accept: application/json'
        ),
        CURLOPT_POSTFIELDS => json_encode($data)
    ));

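    // Send the request
    $result = curl_exec($ch);
    // curl_exec() returns false on failure, which json_decode() turns into null.
    $json = json_decode($result);

    // Crop only when exactly one face is detected; otherwise return the image as-is.
    if(isset($json->facesCount) && $json->facesCount == 1) {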
        $face = $json->faces[0]->rect;
        $crop = "";
        $crop .= minimum($face->left, $padding) . "x" . minimum($face->top, $padding) . ":";
        $crop .= (($face->left + $face->width) + $padding) . "x" . (($face->top + $face->height) + $padding);
        return "https://cdn.vigue.me/unsafe/" . $crop . "/" . $url;
    } else {
        return $url;
    }
}
function stripDiac($text) {
    $text = str_replace("”","\"", $text);
    $text = str_replace("“","\"", $text);
    $text = str_replace("‘","'", $text);
    $text = str_replace("’","'", $text);
    $text = str_replace("–","-", $text);
    $text = str_replace("&","and", $text);
    $text = str_replace("…","...", $text);
    return $text;
}
function posterOrBg($url) {
    $client = new \FasterImage\FasterImage();
    $images = $client->batch([
        $url
    ]);

    foreach ($images as $image) {
        //print_r($image['size']);
        if($image["size"][0] >= $image["size"][1]) {
            //landscape
            return "bg";
        } else {
            return "poster";
        }
    }
}
function googleSearch($query) {
    $response = file_get_contents("https://vigue.me/api/googleSearch.php?q=" . urlencode($query));
    $results = json_decode($response);
    return $results;
}
?>
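
The script above assembles its Thumbor crop URLs by hand in two places (`cropHead()` and the WayBig poster). As a rough sketch only, the padding and clamping arithmetic could live in one pure helper. The name `thumborCropUrl` and its parameters are hypothetical and not part of the script; the base URL is the one the script already uses, and the `LEFTxTOP:RIGHTxBOTTOM` segment mirrors what `cropHead()` emits.

```php
<?php
// Hypothetical helper (not in the original script): builds an "unsafe" Thumbor URL
// with a manual crop of the form LEFTxTOP:RIGHTxBOTTOM around a rectangle.
function thumborCropUrl($imageUrl, $left, $top, $width, $height, $padding = 50, $base = "https://cdn.vigue.me/unsafe/") {
    // Clamp the top-left corner at 0 so padding never yields negative coordinates.
    $x1 = max(0, (int) $left - $padding);
    $y1 = max(0, (int) $top - $padding);
    // The bottom-right corner is the far edge of the rectangle plus padding.
    $x2 = (int) $left + (int) $width + $padding;
    $y2 = (int) $top + (int) $height + $padding;
    return $base . $x1 . "x" . $y1 . ":" . $x2 . "x" . $y2 . "/" . $imageUrl;
}

// A face at (30, 120) measuring 200x220, with the default 50px padding.
echo thumborCropUrl("https://example.com/a.jpg", 30, 120, 200, 220), "\n";
// A poster crop like the WayBig branch: full width, height = 1.5 * width, no padding.
echo thumborCropUrl("https://example.com/b.jpg", 0, 0, 601, (int) round(601 * 1.5), 0), "\n";
```

Keeping the coordinates as integers matters for the poster case, where an odd width multiplied by 1.5 would otherwise put a fractional value into the URL.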
JPH71 commented 4 years ago

Wow... So does Plex allow languages other than Python?

How do you call PHP from Python?

On Sun, 22 Mar 2020, 14:49 Aiden Vigue, notifications@github.com wrote:

PHP script code for who is interested:

JPH71 commented 4 years ago

Also wondering whether this idea would translate to other search engines like Bing, which may be less likely to smack you over the head for using scripted searches...

On Sun, 22 Mar 2020, 15:40 Jason Hudson, jp.hudson@gmail.com wrote:

Wow... So does Plex allow other languages than python?

How do you call php from python?

On Sun, 22 Mar 2020, 14:49 Aiden Vigue, notifications@github.com wrote:

PHP script code for who is interested:

            array_push($metadata["cast"], $member);
        });
        return;
    }
});
//Rating
$metadata["rating"] = 100;
//Cast
/*
$crawler->filter("div[class='thumbnail-grid thumbnail-grid pure-g']")->first()->filter("div > div > a")->each(function($node) {
    global $metadata;
    $url = $node->filter("div[class='thumbnail'] > img")->attr("src");
    $url = str_replace("img/200w/","",$url);
    $url = str_replace("https", "http", $url);
    $url = cropHead($url);
    $name = $node->filter("div[class='thumbnail-bottom-text']")->text();
    $member = array("image"=>$url, "name"=>$name);
    array_push($metadata["cast"], $member);
});
*/
//Images
$crawler->filter("img[src*='https://cdn.8teenboy.com/img/250h/media/stills/']")->each(function($node) {
    global $metadata;
    $url = $node->attr("src");
    $url = str_replace("img/250h/", "img/1920w/", $url);
    array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $url);
});

}

//Helix Studios
else if($studio == "Helix Studios") {
$client = new \Goutte\Client();
$query = str_replace("and","&",$query);
$url = "";
$results = googleSearch("site:helixstudios.net+$query");
foreach($results as $result) {
    if($url == "") {
        if(strpos($result, "video") !== false || strpos($result, "movie") !== false) {
            $url = $result;
        }
    }
}
if(isset($_GET['url'])) {
    $url = $_GET['url'];
}
//echo $url;
$crawler = $client->request('GET', $url);
if(strpos($url, "HXM") !== false) {
    //DVD Title
    $metadata["title"] = $crawler->filter("div.boxContent > h3")->first()->text();
    //DVD Description
    $metadata["summary"] = $crawler->filter("p[class='description']")->text();
    $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
    //Release Date
    $date = $crawler->filter("div.boxContent > div")->first()->text();
    $date = trim(explode(": ", $date)[1]);
    $date = DateTime::createFromFormat('F j, Y', $date);
    $metadata["released"] = $date->format("d-m-Y");

    //Rating
    $metadata["rating"] = 100;
    //Cast
    $crawler->filter("#scene-models > li")->each(function($node) {
        global $metadata;
        $url = $node->filter("a > img")->attr("src");
        $url = str_replace("/img/150w","",$url);
        $url = cropHead($url);
        $name = $node->filter("a > div")->text();
        $member = array("image"=>$url, "name"=>$name, "role"=>"");
        array_push($metadata["cast"], $member);
    });
    //Images
    $id = explode("/", explode("movie/", $url)[1])[0];
    array_push($metadata["images"], "https://cdn.vigue.me/unsafe/https://cdn.helixstudios.com/media/covers/".$id."_back_xlarge.1539280596.jpg");
    array_push($metadata["images"], "https://cdn.vigue.me/unsafe/https://cdn.helixstudios.com/media/covers/".$id."_front_xlarge.1539280596.jpg");
} else {
    //echo $url;
    //Video Title
    $metadata["title"] = $crawler->filter(".scene-title")->text();
    //Video Description
    $metadata["summary"] = $crawler->filter("tr")->eq(1)->text();
    $metadata["summary"] = str_replace("’","'", $metadata["summary"]);
    //Release Date
    $html = $client->getResponse()->getContent();
    $html = explode("</td>", explode("Released: </span>", $html)[1])[0];
    $date = trim($html);
    $date = DateTime::createFromFormat('F j, Y', $date);
    $metadata["released"] = $date->format("d-m-Y");
    //Rating
    $metadata["rating"] = 100;
    //Cast
    $crawler->filter("tr")->first()->filter("a")->each(function($node) {
        global $metadata, $client;
        $url = $node->attr("href");
        $actorpage = $client->request('GET', $url);
        $url = $actorpage->filter("#modelHeadshot > img")->attr("src");
        $url = str_replace("img/320w/","",$url);
        $url = cropHead($url);
        $name = $node->text();
        $member = array("image"=>$url, "name"=>$name, "role"=>"");
        array_push($metadata["cast"], $member);
    });
    //Images
    $i = 0;
    $z = 0;
    $crawler->filter("img[src*='https://cdn.helixstudios.com/img/300h/media/stills/']")->each(function($node) {
        global $metadata, $i, $z;
        $url = $node->attr("src");
        $url = str_replace("img/300h", "img/1920w", $url);
        if($i < 10) {
            if($z % 2 != 1) {
                array_push($metadata["images"],"https://cdn.vigue.me/unsafe/" .  $url);
            }
        }
        $i++;
        $z++;
    });
    $i = 0;
    $z = 0;
    $crawler->filter("img[src*='https://cdn.helixstudios.com/img/300h/members/stills/']")->each(function($node) {
        global $metadata, $i, $z;
        $url = $node->attr("src");
        $url = str_replace("img/300h", "img/1920w", $url);
        if($i < 10) {
            if($z % 2 != 1) {
                array_push($metadata["images"],"https://cdn.vigue.me/unsafe/" .  $url);
            }
        }
        $i++;
        $z++;
    });
}
//file_put_contents("saved/" . urlencode($query) . ".json", json_encode($metadata));

}

//CockyBoys
else if($studio == "CockyBoys") {
$client = new \Goutte\Client();
$results = googleSearch("site:cockyboys.com+$query");
$url = $results[0];
$crawler = $client->request('GET', $url);

//DVD Title
$metadata["title"] = $crawler->filter("h1.sectionTitle")->first()->text();
//DVD Description
$metadata["summary"] = $crawler->filter(".movieDesc")->text();
$metadata["summary"] = str_replace("’","'", $metadata["summary"]);
//Release Date
$date = $crawler->filter(".underPlayer > div")->eq(1)->filter("p > span")->first()->text();
$date = trim(explode(": ", $date)[1]);
$date = DateTime::createFromFormat('d/m/Y', $date);
$metadata["released"] = $date->format("d-m-Y");

//Rating
$metadata["rating"] = floatval($crawler->filter(".underPlayer > div")->first()->filter("p")->text()) * 10;
//Cast
$crawler->filter(".movieModels > span")->each(function($node) {
    global $metadata;
    $url = $node->filter("a")->last()->filter("img")->attr("src");
    $url = preg_replace('/([^:])(\/{2,})/', '$1/', $url);
    $url = cropHead($url);
    $name = $node->filter("a")->first()->text();
    $member = array("image"=>$url, "name"=>$name, "role"=>"");
    array_push($metadata["cast"], $member);
});
$url = str_replace("?type=vids","?type=highres",$url);
$crawler = $client->request('GET', $url);
$crawler->filter(".thumbs")->each(function($node) {
    global $metadata;
    $url = $node->attr("src");
    $url = str_replace("thumbs","1024watermarked",$url);
    $url = preg_replace('/([^:])(\/{2,})/', '$1/', $url);
    array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" . $url);
});

}

//Combined Search
else {
$client = new \Goutte\Client();
$query = str_replace("and","&",$query);
$results = googleSearch("site:waybig.com+OR+site:fagalicious.com+OR+site:bananaguide.com+$query");

$urlG = "";
foreach($results as $result) {
    if($urlG == "") {
        $url = $result;
        if(strpos($url, "ServiceLogin") !== false || strpos($url, "goToSite") !== false || strpos($url, "ftp.waybig.com") !== false || strpos($url, "https://www.waybig.com/gallery/") !== false || strpos($url, "https://www.waybig.com/blog/tag/") !== false || strpos($url, "https://www.waybig.com/tag/") !== false || strpos($url, "https://www.waybig.com/video/") !== false || strpos($url, "https://www.waybig.com/pornstars/") !== false) {
        } else {
            $urlG = $url;
        }
    }
}

$crawler = $client->request('GET', $urlG);
$metadata["title"] = $query;
if(strpos($urlG, "waybig") !== false) {
    //Waybig
    //echo "waybig";
    $slug = explode("/blog/", $urlG)[1];
    $year = explode("/",$slug)[0];
    $month = explode("/",$slug)[1];
    $day = explode("/",$slug)[2];
    $metadata["released"] = $day . "-" . $month . "-" . $year;

    //Poster
    $poster = $crawler->filter("img[src*='zing.waybig.com']")->first()->attr("src");
    $manager = new ImageManager(array('driver' => 'gd'));
    $url = $poster;
    $arrContextOptions=array(
        "ssl"=>array(
            "verify_peer"=>false,
            "verify_peer_name"=>false,
        ),
    );
    $response = file_get_contents($url, false, stream_context_create($arrContextOptions));
    $filename = "zing-" . uniqid() . ".jpg";
    file_put_contents($filename, $response);
    $image = $manager->make(imagecreatefromjpeg($filename));
    $width = $image->width();
    $height = $width * 1.5;
    $poster = "https://cdn.vigue.me/unsafe/0x0:" . $width . "x" . $height . "/" . $poster;
    array_push($metadata["images"], $poster);
    unlink($filename);

    //IAFD infos
    $query = urlencode($query);
    $url = "https://google.com/search?q=site:iafd.com+$query+inurl:title.rme";
    $goog = $client->request("GET", $url);
    $url = $goog->filter("a[href*='/url']")->first()->attr("href");
    $url = explode("/url?q=", $url)[1];
    $url = explode("&", $url)[0];
    $url = urldecode($url);

    $iafd = $client->request("GET", $url);
    try {
        $releaseDate = $iafd->filter(".biodata")->eq(8)->text();
        if($releaseDate != "No Data") {
            $date = DateTime::createFromFormat('M d, Y', $releaseDate);
            $metadata["released"] = $date->format("d-m-Y");
        }
    } catch(Exception $e) {

    }
    $synopsis = trim($iafd->filter("#synopsis > .padded-panel")->first()->text());
    $metadata["summary"] = stripDiac($synopsis);

    //Cast
    $nodes = $iafd->filter(".castbox");
    $nodes->each(function($node) {
        //echo "node";
        global $metadata;
        $name = trim($node->filter("a")->text());
        $src = $node->filter("img")->attr("src");
        $role = $node->text();
        $role = str_replace($name, "", $role);
        $role = trim($role);
        if(strpos($role, "Credited") !== false) {
            $role = explode(")", $role)[1];
        }
        $role = urlencode($role);
        $role = str_replace("%C2%A0", "", $role);
        $member = array("name" => $name, "image" => $src, "role" => $role);
        array_push($metadata["cast"], $member);
    });
} else if(strpos($urlG, "bananaguide") !== false) {
    //Posters
    $nodes = $crawler->filter("a[rel='gallery-image']");
    $nodes->each(function($node) {
        global $metadata;
        $url = $node->attr("href");
        $url = "https://bananaguide.com" . $url;
        array_push($metadata["images"], "https://cdn.vigue.me/unsafe/" .  $url);
    });

    //IAFD infos
    $query = urlencode($query);
    $url = "https://google.com/search?q=site:iafd.com+$query+inurl:title.rme";
    $goog = $client->request("GET", $url);
    $url = $goog->filter("a[href*='/url']")->first()->attr("href");
    $url = explode("/url?q=", $url)[1];
    $url = explode("&", $url)[0];
    $url = urldecode($url);

    $iafd = $client->request("GET", $url);
    try {
        $releaseDate = $iafd->filter(".biodata")->eq(8)->text();
        if($releaseDate != "No Data") {
            $date = DateTime::createFromFormat('M d, Y', $releaseDate);
            $metadata["released"] = $date->format("d-m-Y");
        }
    } catch(Exception $e) {

    }
    $synopsis = trim($iafd->filter("#synopsis > .padded-panel")->first()->text());
    $metadata["summary"] = stripDiac($synopsis);

    //Cast
    $nodes = $iafd->filter(".castbox");
    $nodes->each(function($node) {
        //echo "node";
        global $metadata;
        $name = trim($node->filter("a")->text());
        $src = $node->filter("img")->attr("src");
        $role = $node->text();
        $role = str_replace($name, "", $role);
        $role = trim($role);
        if(strpos($role, "Credited") !== false) {
            $role = explode(")", $role)[1];
            $role = trim($role);
        }
        $role = urlencode($role);
        $role = str_replace("%C2%A0", "", $role);
        $member = array("name" => $name, "image" => $src, "role" => $role);
        array_push($metadata["cast"], $member);
    });
} else{
    //Site Title
    $sTitle = stripDiac($crawler->filter(".entry-title")->first()->text());
    //Release Date
    $date_raw = trim($crawler->filter(".meta-date")->text());
    $date = DateTime::createFromFormat('F j, Y', $date_raw);
    $metadata["released"] = $date->format("d-m-Y");

    //Cast
    $tags = $crawler->filter(".post-meta > a[href*='/tag/']");
    $tags->each(function($node) {
        global $sTitle, $metadata;
        $tag = trim(stripDiac($node->text()));
        if(strpos(strtolower($sTitle), strtolower($tag)) !== false && strpos($tag, " ") !== false) {
            //echo $tag;
            //tag in title, assume cast member?

            $img = getIAFDActorImage($tag);
            if($img != "") {
                $member = array("name"=>$tag, "image"=>$img);
                array_push($metadata["cast"], $member);
            }

ghost commented 4 years ago

Magic.bundle makes a GET request to the magic-metadata.php file hosted on my site. That script calls googleSearch.php, which uses a Selenium Grid instance running in Docker to scrape Google with a real browser so the requests don't get blocked. All images are proxied through cdn.vigue.me, which caches them for 1 week to speed up future scrapes of the same title.
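
Seen from the client side, that flow can be sketched roughly as below. This is only an illustration in plain Python 3, outside the Plex framework: the endpoint host (`example.invalid`), the `studio` and `q` parameter names, and both helper names are assumptions, not the actual Magic.bundle code. Only the `https://cdn.vigue.me/unsafe/` prefix is taken from the script above.

```python
from urllib.parse import urlencode

# Prefix taken from the PHP script above; images are fetched through this proxy.
IMAGE_PROXY = "https://cdn.vigue.me/unsafe/"

# Hypothetical endpoint: the real host of magic-metadata.php is not given in this thread.
METADATA_ENDPOINT = "https://example.invalid/magic-metadata.php"


def build_metadata_url(studio, query, endpoint=METADATA_ENDPOINT):
    """Build the GET URL an agent would request (parameter names are assumed)."""
    return endpoint + "?" + urlencode({"studio": studio, "q": query})


def proxy_image(url, proxy=IMAGE_PROXY):
    """Prefix an image URL with the caching proxy, unless it is already proxied."""
    if url.startswith(proxy):
        return url
    return proxy + url
```

Keeping the proxy prefix in one helper means an agent can switch to a different Thumbor instance by changing a single value.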

CodyBerenson commented 4 years ago

Terrific to see you back Aiden. Stay healthy everyone!

CodyBerenson commented 4 years ago

Hi, thanks for the innovative approach to Agent scraping.

I've tested against the seven most recent additions to WayBig, with varied results:

Matched, retrieved mostly accurate metadata (see notes per title), retrieved poster art:

  1. (TimTales) - Tim Kruger and Santiago Rodriguez (2020): poster art is correct, but scene description, release date, etc. are mismatched
  2. (Corbin Fisher) - Ethan IV (2020): poster art is correct, but scene description, release date, etc. are mismatched
  3. (Fuckermate) - Abel Sanztin and Valdo Smith (2020): all metadata is accurate

Matched, retrieved mostly accurate metadata (release date is incorrect), retrieved NO poster art:

  1. (CockyBoys) - Calvin Banks and Nico Leon (2020)

No Match:

  1. (Active Duty) - Jesse Nice and Brandon Anderson (2020)
  2. (HotHouse) - Arad Winwin and Angel Rivera (2020)
  3. (Next Door Casting) - Rockey Goldenrod (2020), with or without spaces in the studio name

Hope this helps.

Cody com.plexapp.agents.Magic.log

ghost commented 4 years ago

The "No Match" ones are due to incomplete IAFD pages or mismatched results. I will add checks and the like to verify the data before trying alternate sources. Thank you for the heads up!

I think the reason my library was scraped so well is that most of it was Helix & 8TeenBoy, with only about 300-400 titles coming from WayBig.

Also, could you please take my name out of the comments? I'm converting this account to be a specific GitHub account for this project.

Thanks again!

CodyBerenson commented 4 years ago

@JPH71 please check in, let us know you and yours are doing ok.

ghost commented 4 years ago

Google has no search filter for what I am trying to accomplish. I might make it so that you can manually select a result to use when auto matching doesn't work. Sorry I forgot to respond; I am currently transcoding a 48TB RAID array from H.264 to HEVC to save space, as it is full. Fun times.

JPH71 commented 4 years ago

@CodyBerenson

I am still alive and breathing... Swamped with work as our team has been split in two and some are not able to work from home.

But tonight I am back to Python... I have had enough of effing VBA

hugs and kisses

Jason

j-ktz commented 4 years ago

@JPH71 It's been doing it for Waybig for me. Where can I find the newest agent where this is fixed?

JPH71 commented 4 years ago

@j-ktz - it's being uploaded by Cody. I sent it to him about a fortnight ago...

CodyBerenson commented 4 years ago

@j-ktz and @JPH71: the bundles that Jason posted above (now 10 days ago) were uploaded to master the next day (now 9 days ago), immediately after all three were tested. Please archive your current bundles and retrieve the latest ones from the code tab. The code tab always has the latest updated and tested bundles.

I would suggest changing the Thumbor image-cropping preference in each of the three scene bundles to:

https://cdn.vigue.me/unsafe
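
As a rough illustration of what that preference feeds into: the script earlier in this thread builds poster URLs of the form `<base>/0x0:<width>x<height>/<image URL>`, with the height set to 1.5 times the width. A minimal Python 3 sketch of that layout follows; the function name and the rounding to whole pixels are assumptions, not code from the bundles.

```python
def thumbor_crop_url(base, image_url, width, ratio=1.5):
    """Return a Thumbor URL that crops from the top-left corner to width x (width * ratio)."""
    height = int(width * ratio)  # assumption: use whole pixels for the crop box
    return "{}/0x0:{}x{}/{}".format(base.rstrip("/"), width, height, image_url)


# Example with a made-up image path on the WayBig image host:
print(thumbor_crop_url("https://cdn.vigue.me/unsafe",
                       "https://zing.waybig.com/example.jpg", 800))
```

Stripping the trailing slash from the base means the preference works whether or not it is entered with one.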

CodyBerenson commented 4 years ago

P.S. Sorry if that sounded terse; that was NOT my intention. My intention is only to be helpful.

Cheers!