datlechin / flarum-link-preview

MIT License
17 stars, 5 forks

Link preview is changing my links #4

Status: Closed (opened by orschiro, closed 2 years ago)

orschiro commented 2 years ago

Hi!

Thanks for your great extension.

However, it's changing my link https://app.sheetgo.com/account/usage into https://app.sheetgo.com.

Video: Screen recording (4).webm

datlechin commented 2 years ago

Thanks for the report.

orschiro commented 2 years ago

You're welcome!

datlechin commented 2 years ago

So weird, it didn't change any links for me.

Screenshot 2022-09-13 at 15 35 13

orschiro commented 2 years ago

Thanks for testing as well!

Any idea why it's not working for us then?

See this example: https://community.sheetgo.com/d/275-testing-link-preview

And here is another video:

Screen recording (5).webm

spekulatius commented 2 years ago

Hey @orschiro,

I built the scraping lib used by @datlechin's link-preview extension. I might have an idea: does this page require users to be signed in? If so, the site might redirect the scraper to another page, which would lead to the observed behavior.

Cheers, Peter

orschiro commented 2 years ago

Hi Peter,

Yes, it indeed does require people to be signed in.

How can we go about this issue?

spekulatius commented 2 years ago

Hey @orschiro,

The scraper can only access what is publicly available to it. Any other scraper would have the same problem.

But there are options to get it working: for example, you could serve a simple "mock page" containing only the required data and return this page based on the user agent. By default, the user agent string contains phpscraper as an identifier. This would be transparent for users. Do you think this could work?

Cheers, Peter

orschiro commented 2 years ago

I'm all good with that. Final word is with @datlechin :)

datlechin commented 2 years ago

Might it work if I use a user agent based on the client's?

->setAgent($_SERVER['HTTP_USER_AGENT'])

spekulatius commented 2 years ago

The agent only helps if the accessed site changes its response from a redirect to a mock page, as described. @orschiro, can you create a simple page that returns only the Open Graph (OG) data when the user agent contains "phpscraper"?

orschiro commented 2 years ago

Maybe the simplest solution would be always displaying the original link below the preview block so people have an alternative in case the preview breaks?
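To illustrate the suggestion above, here is a rough sketch of rendering the original link underneath a preview block. The function name `renderPreviewWithFallback` and the markup are hypothetical, for illustration only; this is not how the extension implements it.

```php
<?php
declare(strict_types=1);

/**
 * Appends the original, unmodified link below an already rendered preview.
 * Hypothetical helper: $previewHtml stands for whatever markup the preview
 * produced, even if it was built from a redirected page.
 */
function renderPreviewWithFallback(string $originalUrl, string $previewHtml): string
{
    // Escape the URL so it is safe both as an attribute value and as link text.
    $safeUrl = htmlspecialchars($originalUrl, ENT_QUOTES, 'UTF-8');

    return $previewHtml . "\n" . sprintf('<a href="%s">%s</a>', $safeUrl, $safeUrl);
}
```

With this approach, a reader can still reach https://app.sheetgo.com/account/usage even when the preview itself shows the page the scraper was redirected to.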

datlechin commented 2 years ago

@orschiro I hadn't thought of this. I hope this release works for you: https://github.com/datlechin/flarum-link-preview/releases/tag/v1.0.1

orschiro commented 2 years ago

Wonderful, thanks! :)

orschiro commented 2 years ago

Sorry but I'm not sure what you mean by simple page that only contains OG data.

spekulatius commented 2 years ago

Sorry but I'm not sure what you mean by simple page that only contains OG data.

I was thinking of exposing the desired values for the page that is usually behind the login. Here, that would be og:site_name, og:title, og:description, and og:image for https://app.sheetgo.com/account/usage. The redirect would also need to be tweaked to make it work. Did the earlier solution by @datlechin work for you?
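As a minimal sketch of what such a mock page could look like: the code below assumes the protected site can run PHP before its login redirect, which the thread does not confirm. The function names and all OG values are hypothetical placeholders. Only the "phpscraper" identifier and the four property names come from the discussion above.

```php
<?php
declare(strict_types=1);

/**
 * Returns true when the user agent contains the "phpscraper" identifier.
 * The check is case-insensitive; a missing user agent counts as no match.
 */
function isPhpScraperAgent(?string $userAgent): bool
{
    return $userAgent !== null && stripos($userAgent, 'phpscraper') !== false;
}

/**
 * Builds a minimal HTML page that exposes only Open Graph metadata.
 *
 * @param array<string, string> $og Map of OG property name to content.
 */
function renderOgMockPage(array $og): string
{
    $tags = '';
    foreach ($og as $property => $content) {
        $tags .= sprintf(
            "  <meta property=\"%s\" content=\"%s\">\n",
            htmlspecialchars($property, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($content, ENT_QUOTES, 'UTF-8')
        );
    }

    return "<!DOCTYPE html>\n<html>\n<head>\n" . $tags . "</head>\n<body></body>\n</html>\n";
}

/**
 * Returns the mock page for the scraper, or null when the request should
 * continue to the normal login redirect.
 */
function mockPageForRequest(?string $userAgent, array $og): ?string
{
    return isPhpScraperAgent($userAgent) ? renderOgMockPage($og) : null;
}

// Placeholder values: the real texts and image URL are not given in the thread.
$exampleOg = [
    'og:site_name'   => 'Example site name',
    'og:title'       => 'Example page title',
    'og:description' => 'Example description of the page behind the login.',
    'og:image'       => 'https://example.com/preview.png',
];
```

On the protected route, the site would call something like `mockPageForRequest($_SERVER['HTTP_USER_AGENT'] ?? null, $exampleOg)` before redirecting, echo the result if it is not null, and otherwise redirect to the login as usual. Regular visitors would see no difference.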

orschiro commented 2 years ago

Yes, it was a good workaround! :)
