nextcloud / cookbook

🍲 A library for all your recipes
https://apps.nextcloud.com/apps/cookbook
GNU Affero General Public License v3.0

Import with noscript: Unable to import Betty Bossy (CH) #770

Open vincegre opened 3 years ago

vincegre commented 3 years ago

Description

I am trying to import recipes from a website, but the import does not work.

Reproduction

  1. Go to the Betty Bossy website and select a recipe, for example this one: https://www.bettybossi.ch/fr/Rezept/ShowRezept/BB_SCSC160101_0136A-20-fr?WT.mc_id=snl_bettykocht_210719_f&utm_source=emarsys&utm_medium=email&utm_campaign=snl_bettykocht_210719_f&sc_src=email_9395578&sc_lid=469723034&sc_uid=6uojPGZIec&sc_llid=7715&sc_customer=27471380810&title=Nouilles+aux+haricots+verts+et+pesto
  2. Paste URL of recipe into Import URL field of Cookbook
  3. Click OK
  4. When the process finishes, an empty popup appears without any message and nothing is imported

Expected behavior

The recipe is imported properly, since the website follows the schema.org markup for recipes (I validated this myself with the online schema.org validator tool). Imports from other websites have worked in the past.

Actual behavior

Only the empty popup is displayed (see below), and then you are returned to Cookbook with nothing changed.

Screenshots Screenshot_20210720_165105

Browser

Netscape 90.0, 64-bit, on Ubuntu 20.04

Versions

Nextcloud server version: 21.0.3
Cookbook version: 0.8.4
Database system: MySQL

Thanks

Vincèn

bfritscher commented 3 years ago

Betty Bossy protects its pages against "simple" GET requests issued from a command line or a server without cookies. You can test this with, for example, https://www.view-page-source.com/. The real page source is only served once the client has passed these checks.
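
To illustrate the difference between a bare GET and a cookie-aware client, here is a minimal Python sketch using only the standard library. It is not Cookbook's actual import code. `build_cookie_opener`, `fetch`, and the User-Agent string are hypothetical names chosen for illustration, and whether keeping cookies alone is enough to pass the site's checks is an assumption.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(user_agent="Mozilla/5.0 (compatible; recipe-import-sketch)"):
    """Return (opener, jar): an opener that stores cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Hypothetical User-Agent; the checks the site really performs are not known here.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch(opener, url, timeout=10):
    """Fetch a URL; cookies set along the way (including on redirects) stay in the jar."""
    with opener.open(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Reusing the same opener for every request in one import attempt is the point of the design: a cookie set by the first response is sent back on the follow-up request, which a one-shot command-line GET does not do.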

christianlupus commented 3 years ago

I'd say this is rather an extension to the current import parser: the original page contains a noscript tag that redirects to a static page, and that static page does contain the relevant metadata in microdata format. For the named page that would be this URL, with cookies allowed.
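
A rough Python sketch of what such a parser extension might do is shown here. It assumes the noscript redirect is a `<meta http-equiv="refresh">` tag, which the thread does not confirm. All class and function names are hypothetical, and the example.org URL in the usage note is a placeholder. This is not Cookbook's parser, only an illustration of the idea: find the noscript redirect target, then check the static page for schema.org Recipe microdata.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NoscriptRedirectParser(HTMLParser):
    """Record the target of the first meta refresh found inside <noscript>."""

    def __init__(self):
        super().__init__()
        self._noscript_depth = 0
        self.redirect_target = None

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._noscript_depth += 1
        elif tag == "meta" and self._noscript_depth and self.redirect_target is None:
            attr = dict(attrs)
            if (attr.get("http-equiv") or "").lower() == "refresh":
                # The content attribute looks like "0; url=/static/page".
                _, _, rest = (attr.get("content") or "").partition(";")
                key, _, value = rest.strip().partition("=")
                if key.strip().lower() == "url" and value:
                    self.redirect_target = value.strip().strip("'\"")

    def handle_endtag(self, tag):
        if tag == "noscript" and self._noscript_depth:
            self._noscript_depth -= 1


class RecipeMicrodataDetector(HTMLParser):
    """Set `found` when any element declares a schema.org Recipe itemtype."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        itemtype = dict(attrs).get("itemtype") or ""
        if itemtype.rstrip("/").lower().endswith("schema.org/recipe"):
            self.found = True


def find_noscript_redirect(html, base_url):
    """Return the absolute noscript redirect URL, or None if there is none."""
    parser = NoscriptRedirectParser()
    parser.feed(html)
    parser.close()
    if parser.redirect_target is None:
        return None
    return urljoin(base_url, parser.redirect_target)


def has_recipe_microdata(html):
    """Return True if the HTML contains schema.org Recipe microdata."""
    detector = RecipeMicrodataDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

An importer built along these lines would fetch the original page, call `find_noscript_redirect(html, "https://example.org/recipe")`, fetch the returned URL with cookies enabled, and only then hand the static page to the microdata parser if `has_recipe_microdata` reports a match.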

The alternative would be to use a bookmarklet (#431), with the client browser acting as the decoder.