WWBN / AVideo

Create Your Own Broadcast Network With AVideo Platform Open-Source. OAVP OVP
https://avideo.tube/AVideo_OpenSource

URL image preview on Telegram and Whatsapp #5110

Closed guymass closed 3 years ago

guymass commented 3 years ago

All of our sites' URLs used to show their post image when we shared them on Telegram, WhatsApp and Matrix Element.

Now two out of three sites no longer show them, only the URL. However, one site still does, so I am not sure whether a recent update you made might have caused this.

If it were all three sites I would say it's probably a Telegram thing, but one site still shows those images when I share its video URLs in our Telegram groups.

Here is a screen shot of the three pasted URLs, the last one is showing the image.

Screen Shot 5781-08-28 at 0 48 11

DanielnetoDotCom commented 3 years ago

Tested on WhatsApp with a demo site URL and it looks fine.

image

guymass commented 3 years ago

So what is missing in my posts, or what should I look for? When I paste my URLs from live.ahava528.com, they don't show a preview like this.
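One thing worth checking for the question above: Telegram, WhatsApp and Facebook generally build link previews from Open Graph meta tags (such as og:title and og:image) in the page's HTML. The sketch below, using only Python's standard library, lists the Open Graph tags found in an HTML string and reports which ones are absent. The helper names and the choice of "required" tags are assumptions made for illustration; they are not part of AVideo.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags from an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content"):
            # Keep the first occurrence of each property.
            self.tags.setdefault(prop, attr["content"])


def open_graph_tags(html):
    """Return a dict of Open Graph properties found in the HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


def missing_preview_tags(html, required=("og:title", "og:image")):
    """Return the required tags (an assumed minimal set) that are absent."""
    tags = open_graph_tags(html)
    return [name for name in required if name not in tags]
```

Feeding it the HTML that a crawler actually receives (for example, the output of curl for a video URL) shows whether the preview tags reach the client at all. An empty response would report every required tag as missing.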

DanielnetoDotCom commented 3 years ago

Maybe WhatsApp/Instagram could not access your files.

Check that you are not blocking bots.

Check your access log and see whether their bots can reach your server.

guymass commented 3 years ago

Could be, I don't know. If there is a block, how would I allow them access? In the logs, many bot access lines show this message:

.php
[16-May-2021 01:23:19 Asia/Jerusalem] AVideoLog::DEBUG: _json_encode: Error 1 Found: Malformed UTF-8 characters, possibly incorrectly encoded SCRIPT_NAME: /plugin/Chat2/getChatTotalNew.json.php
[16-May-2021 01:23:19 Asia/Jerusalem] AVideoLog::DEBUG: _json_encode: Error 2 Found: Malformed UTF-8 characters, possibly incorrectly encoded SCRIPT_NAME: /plugin/Chat2/getChatTotalNew.json.php
[16-May-2021 01:23:19 Asia/Jerusalem] AVideoLog::DEBUG: _json_encode: Error 3 Found: Malformed UTF-8 characters, possibly incorrectly encoded SCRIPT_NAME: /plugin/Chat2/getChatTotalNew.json.php
[16-May-2021 01:23:19 Asia/Jerusalem] AVideoLog::DEBUG: Bot Detected, NOT showing the cache (/video/488/2-%D7%94%D7%94%D7%AA%D7%A2%D7%95%D7%A8%D7%A8%D7%95%D7%99%D7%95%D7%AA-%D7%A9%D7%9C%D7%99?channelName=RaisingTheFrequency&yptDeviceID=a0447ac2-9ac7-4061-8b32-5c526456d633) FROM: 185.191.171.6 Browser: Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html) SCRIPT_NAME: /view/index.php
[16-May-2021 01:23:19 Asia/Jerusalem] AVideoLog::DEBUG: Bot stopped SCRIPT_NAME: /view/index.php
[16-May-2021 01:23:30 Asia/Jerusalem] AVideoLog::DEBUG: Bot Detected, NOT showing the cache (/video/358/%D7%9E%D7%98%D7%A2%D7%9F-%D7%97%D7%A9%D7%9E%D7%9C%D7%99-%D7%91%D7%9E%D7%99%D7%9D---%D7%9E%D7%99-%D7%91%D7%A8%D7%96-%D7%91%D7%9B%D7%A8%D7%9E%D7%99%D7%90%D7%9C-%D7%90%D7%97%D7%A8%D7%99-%D7%98%D7%99%D7%A4%D7%95%D7%9C---%D7%97%D7%9C%D7%A7-2?channelName=RaisingTheFrequency&vmap_id=0&yptDeviceID=f0c8141a-e091-4a05-84d6-012af3fbacfd) FROM: 185.191.171.4 Browser: Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html) SCRIPT_NAME: /view/index.php
[16-May-2021 01:23:30 Asia/Jerusalem] AVideoLog::DEBUG: Bot stopped SCRIPT_NAME: /view/index.php
[16-May-2021 01:23:32 Asia/Jerusalem] AVideoLog::DEBUG: getPoster(1204, 2, ) SCRIPT_NAME: /plugin/Live/getImage.php
[16-May-2021 01:23:32 Asia/Jerusalem] AVideoLog::DEBUG: Live::_getStats cached result 2 undefined getStats/live_servers_id2/undefined SCRIPT_NAME: /plugin/Live/getImage.php
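To see which clients trigger the "Bot Detected" entries in a log like the one above, the entries can be pulled apart with a regular expression. This is a minimal sketch that assumes every entry follows the exact layout shown in this thread (timestamp, path in parentheses, FROM, Browser, SCRIPT_NAME) and sits on a single line; the function names are hypothetical.

```python
import re
from collections import Counter

# Layout assumed from the log entries quoted in this thread.
BOT_LINE = re.compile(
    r"\[(?P<when>[^\]]+)\] AVideoLog::DEBUG: Bot Detected, NOT showing the cache "
    r"\((?P<path>.*?)\) FROM: (?P<ip>\S+) Browser: (?P<agent>.*?) "
    r"SCRIPT_NAME: (?P<script>\S+)"
)


def bot_hits(log_text):
    """Return one dict per 'Bot Detected' entry found in the text."""
    return [match.groupdict() for match in BOT_LINE.finditer(log_text)]


def hits_per_agent(log_text):
    """Count 'Bot Detected' entries per user-agent string."""
    return Counter(hit["agent"] for hit in bot_hits(log_text))
```

Calling hits_per_agent on the contents of the log file gives a quick view of whether the entries come from search crawlers such as SemrushBot or from the preview fetchers of messaging apps. Entries that were wrapped across several lines when pasted will not match until they are rejoined.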

matheusfillipe commented 3 years ago

@DanielnetoDotCom I am helping Guy with this issue; his platform is blocking bots. We have many log entries like:

[30-May-2021 00:15:48 Asia/Jerusalem] AVideoLog::DEBUG: Bot Detected, NOT showing the cache (/view/?v=695&page=61&rrating=ma&yptDeviceID=020880f7-7622-4cd1-abac-368bc06f610c&channelName=RaisingTheFrequency) FROM: 66.249.64.183 Browser: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.97 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) SCRIPT_NAME: /view/index.php

These appear for curl and for Telegram requests. However, in videos/config.php I have set:

$global['stopBotsList'] = array();

I have also tried other combinations like:

$global['stopBotsList'] = array('rouwler','Nuclei','MegaIndex','NetSystemsResearch','CensysInspect','slurp','crawler','fetch','loader');
$global['stopBotsWhiteList'] = array('curl','google','bing','yahoo','yandex','Googlebot');

But it doesn't work. If I try to curl one page like:

https://live.ahava528.com/video/3551/%D7%97%D7%99%D7%A1%D7%95%D7%9C%D7%99%D7%9D-%D7%91%D7%A2-%D7%9D-%7C-%D7%90%D7%94%D7%91%D7%AA-%D7%A2%D7%95%D7%9C%D7%9D-%D7%90%D7%94%D7%91%D7%AA%D7%A0%D7%95---%D7%9E%D7%90%D7%99%D7%A8-%D7%A7%D7%93%D7%95%D7%A9-%D7%95%D7%A4%D7%A8%D7%97-%D7%A2%D7%A5-%D7%94%D7%97%D7%99%D7%99%D7%9D\?channelName\=WeActLive

It gives an empty response. If I add curl to the stopBotsList instead, it responds with Bot Found curl/7.76.1, but otherwise the response is just empty. So this config is not completely ignored, but something else seems to be getting in the way.

DanielnetoDotCom commented 3 years ago

Make sure you have this unchecked

image

matheusfillipe commented 3 years ago

@DanielnetoDotCom That was indeed enabled. After disabling it, curl now responds with the HTML of the page. Telegram and other previews still don't work, and I still get the Googlebot log entries:

[31-May-2021 18:48:33 Asia/Jerusalem] AVideoLog::DEBUG: Bot Detected, NOT showing the cache (/video/376/%D7%9E%D7%9B%D7%95%D7%A0%D7%AA-%D7%9E%D7%99-%D7%94%D7%9B%D7%A1%D7%A3-%D7%94%D7%97%D7%93%D7%A9%D7%94?page=14&type=all&yptDeviceID=e68a0853-0206-46f5-8561-63e0b7c5913d&channelName=RaisingTheFrequency) FROM: 66.249.64.187 Browser: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.97 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) SCRIPT_NAME: /view/index.php

akhilleusuggo commented 3 years ago

@DanielnetoDotCom Same here, Daniel, and for a very long time. In particular, every time Google accesses images, Googlebot gets blocked and redirected to the 404 page.

guymass commented 3 years ago

OK, I can confirm that the link preview came back today, maybe due to the latest updates/fixes you made. I will keep this open for a few more days just to make sure it is still working on Telegram, WhatsApp and Facebook.

DanielnetoDotCom commented 3 years ago

I do not get it. On my side the demo site works on WhatsApp and Facebook; I could not test on Telegram.

matheusfillipe commented 3 years ago

With the config like this:

$global['stopBotsList'] = array('rouwler','Nuclei','MegaIndex','NetSystemsResearch','CensysInspect','slurp','crawler','fetch','loader');
$global['stopBotsWhiteList'] = array('curl','google','bing','yahoo','yandex','Googlebot');

I can curl and preview from Telegram with stopBotsFromNoCachePages disabled. I am still getting those Googlebot log entries, though.

DanielnetoDotCom commented 3 years ago

Sorry, but maybe it is not clear to me: do you want to stop or allow Googlebot?

If you want to block it, you must add Googlebot to the stopBotsList. Currently, Googlebot is whitelisted.

Just to be clear, "Bot Detected, NOT showing the cache" is not an error and does not mean the bot is being blocked. It is just a warning that a bot is consuming your resources, since it is not being shown the cache.
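The interplay between the two lists discussed in this thread can be illustrated with a small sketch. It assumes case-insensitive substring matching against the user-agent string and assumes that the whitelist takes precedence over the stop list; both are assumptions for illustration and are not confirmed as AVideo's actual logic. The list contents are the ones quoted earlier in the thread, and the function names are hypothetical.

```python
# List contents copied from the config quoted in this thread.
STOP_LIST = ["rouwler", "Nuclei", "MegaIndex", "NetSystemsResearch",
             "CensysInspect", "slurp", "crawler", "fetch", "loader"]
WHITE_LIST = ["curl", "google", "bing", "yahoo", "yandex", "Googlebot"]


def first_match(user_agent, needles):
    """Return the first needle contained in the user agent, ignoring case."""
    agent = user_agent.lower()
    for needle in needles:
        if needle.lower() in agent:
            return needle
    return None


def decide(user_agent, stop_list=STOP_LIST, white_list=WHITE_LIST):
    """Return ("allow" | "block", matching entry or None).

    Assumption: a whitelist match wins over a stop-list match.
    """
    allowed_by = first_match(user_agent, white_list)
    if allowed_by is not None:
        return ("allow", allowed_by)
    blocked_by = first_match(user_agent, stop_list)
    if blocked_by is not None:
        return ("block", blocked_by)
    return ("allow", None)
```

Under these assumptions, decide("curl/7.76.1") is allowed through the 'curl' whitelist entry, while decide("curl/7.76.1", stop_list=["curl"], white_list=[]) is blocked, which mirrors the "Bot Found curl/7.76.1" response reported above. With case-insensitive matching the 'google' entry already covers Googlebot, so listing both would be redundant.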

matheusfillipe commented 3 years ago

@DanielnetoDotCom Yeah, I wanted to allow it. Oh, so that is just a warning; I thought it was blocking the bot. In that case it is fine. Yes, I want to allow Googlebot.

I think this issue can be closed @guymass. The previews are working.

akhilleusuggo commented 3 years ago

"Just to be clear, "Bot Detected, NOT showing the cache" is not an error and does not mean the bot is being blocked. It is just a warning that a bot is consuming your resources, since it is not being shown the cache."

Not true (following up with the logs). Googlebot gets redirected to the 404 page.

@DanielnetoDotCom Also, there's something wrong with the log file: 404 errors no longer show up in the logs, and the 404 page is not displaying properly.

image

DanielnetoDotCom commented 3 years ago

The 404 page is not a bot block; it is probably a real page not found.

We do not redirect bots. If a bot is detected and blocked, we just kill the script, and it should return a blank page with HTTP code 200.
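Going only by the description above (a blocked bot gets a blank page with HTTP 200, while a 404 is a genuine missing page), the two cases can be told apart by fetching a URL with a chosen User-Agent and looking at the status and body. This is a rough sketch using Python's standard library; the function names and the wording of the result labels are hypothetical.

```python
import urllib.error
import urllib.request


def fetch(url, user_agent, timeout=10):
    """Fetch a URL with the given User-Agent; return (status, body bytes)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # Non-2xx responses arrive as exceptions but still carry a body.
        return error.code, error.read()


def diagnose(status, body):
    """Classify a response according to the behaviour described above."""
    if status == 404:
        return "page not found"
    if status == 200 and not body.strip():
        return "possible bot block (blank 200)"
    if status == 200:
        return "page served"
    return "other (HTTP %d)" % status
```

For example, diagnose(*fetch(url, "curl/7.76.1")) can be compared with the same call using a crawler's user-agent string to see whether the two are treated differently. A blank 200 only suggests a block; an empty page can have other causes.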

akhilleusuggo commented 3 years ago

@DanielnetoDotCom can you fix the echoed name "search", please? Or could you tell me where the 404 error page script is? Some quotes are missing; it should show only ''search''.

DanielnetoDotCom commented 3 years ago

You are right, here is the fix

guymass commented 3 years ago

This issue is fixed for me, thank you very much, I will close this now.