benjypng / logseq-recipeimporter-plugin

MIT License
7 stars, 1 fork

Sites that do not work #1

Open richd67 opened 2 years ago

richd67 commented 2 years ago

Unfortunately, none of the sites I tried worked. List below:

benjypng commented 2 years ago

Hi! Are you on the latest version?

The first and last links actually worked for me, but not the second one.

richd67 commented 2 years ago

I just tried the first one again after checking that I am on the most up-to-date version, and I am still getting this error: 'There was an error parsing the url, Please file a github issue with the url'

nicgentile commented 2 years ago

Hi @hkgnp, I can confirm what @richd67 said. No sites worked for me at all. Something is broken. It might be this build; I am on Debian.

Thanks.

benjypng commented 2 years ago

Thanks all. That is weird as it seems to be working on mine.

Importing recipes is a bit hit and miss because of the way websites structure their recipe content. They do not all do it the same way, which results in a multitude of permutations!

benjypng commented 2 years ago

Do you have an example to share?

nicgentile commented 2 years ago

https://www.kingarthurbaking.com/recipes/chocolate-wafers-recipe https://www.seriouseats.com/choux-pastry https://www.averiecooks.com/black-white-cookies-cream-cheese-chocolate-chip-dark-chocolate-dark-brown-sugar/ https://lovelyjubley.com/vegan-salted-dark-chocolate-chilli-pots/ https://www.allrecipes.com/gallery/chicken-foil-packet-recipes/

Regardless of the site, none of them work, and with the little that I know about these things, the error is instant, not delayed, so the request itself might not even be happening in the first place.

benjypng commented 2 years ago

Does the plugin create a page after you paste the URL and press Enter, or does nothing happen at all?

nicgentile commented 2 years ago

Nothing happens. Literally, the immediate response is the error message, less than a microsecond after 'enter'. It's instantaneous, hence why I believe no request happens in the first place.

benjypng commented 2 years ago

Hmm, that means something else isn't working as it is supposed to. I tried the links you provided above and some of them do work, so I am not sure what is happening on your side.

Anyway, I am in the process of refactoring the code to accommodate more recipe templates, although I must still caveat that it will not cover 100% of sites, as they really have all sorts of permutations.

pushakargaikwad commented 2 years ago

https://shwetainthekitchen.com/creamy-vegetable-sandwich/ I think this site uses WordPress Recipe Maker Plugin.

mstine commented 2 years ago

Additional problem URLs:

Same error message got me to post these. Thanks!

JCGitH commented 2 years ago

The error received in Logseq states "Apologies, there was an error parsing the URL. Please file a Github issue with the URL." The URL was https://www.allrecipes.com/recipe/230660/orange-honey-and-soy-chicken/ (tried on 7/21/22).

kensipe commented 2 years ago

I'm getting the failures as well. From the dev tools console:

Access to XMLHttpRequest at 'https://www.seriouseats.com/choux-pastry' from origin 'lsp://logseq.io' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
index.e2686684.js:1 Error: Network Error
    at a.register.e.exports (index.e2686684.js:1:213808)
    at b.onerror (index.e2686684.js:1:212385)
index.e2686684.js:1          GET https://www.seriouseats.com/choux-pastry 

The top hit of a Google search is another plugin with the same issue, which @hkgnp seemed to have some awareness of: https://github.com/hkgnp/logseq-mermaid-plugin/issues/4
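
The console output above points at a failed request rather than a failed parse, yet the message users see talks about parsing. A minimal sketch of keeping the two cases apart follows. It is not the plugin's actual code: `fetchPage`, `describeOutcome`, `Fetcher` and the outcome names are hypothetical, and the fetch function is injected so the sketch does not depend on any particular HTTP library.

```typescript
// Minimal response/fetch shapes (assumptions) so the sketch needs no DOM typings.
interface PageResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type Fetcher = (url: string) => Promise<PageResponse>;

// The three ways a page download can end, kept separate from parse errors.
type FetchOutcome =
  | { kind: "ok"; html: string }
  | { kind: "http-error"; status: number }
  | { kind: "network-error"; reason: string };

async function fetchPage(url: string, fetcher: Fetcher): Promise<FetchOutcome> {
  let response: PageResponse;
  try {
    response = await fetcher(url);
  } catch (err) {
    // A request blocked by CORS ends up here: the promise rejects
    // before any HTML is available, so there is nothing to parse.
    const reason = err instanceof Error ? err.message : String(err);
    return { kind: "network-error", reason };
  }
  if (!response.ok) {
    return { kind: "http-error", status: response.status };
  }
  return { kind: "ok", html: await response.text() };
}

// Turn an outcome into a message that names the real cause.
function describeOutcome(outcome: FetchOutcome): string {
  switch (outcome.kind) {
    case "ok":
      return "Page fetched; parsing can proceed.";
    case "http-error":
      return `The site responded with HTTP ${outcome.status}.`;
    case "network-error":
      return `The page could not be fetched (${outcome.reason}). This is not a parsing problem.`;
  }
}
```

With something like this, an instant failure such as the one in the log would be reported as a fetch problem, which would also match the observation earlier in the thread that the error appears immediately.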

uhuang commented 1 year ago

https://food52.com/recipes/62716-clear-wonton-broth https://www.seriouseats.com/homemade-wonton-soup-recipe

Neither of the above work for me. I was, however, able to get the Beef Curry link in the original post to work.

benjypng commented 1 year ago

Yeah, it is hard to parse recipes as there is no real standard out there that every website uses. I am still trying to find a better way to parse them than the one in the plugin now.

Apologies.

lukas-mertens commented 1 year ago

@hkgnp

https://cookidoo.de/recipes/recipe/de-DE/r282724

Here is another one. To be honest, I would suggest adding a framework for scraping data from websites and letting the community contribute and maintain concrete implementations for specific websites. Keep the current "one-size-fits-all" algorithm as a fallback and let users contribute custom ones, similar to how youtube-dl does it. Basically, an abstract scraper class with concrete implementations, where every implementation starts with a regex that declares which URLs it should handle.

If you did something like that, I can even imagine that we could write the scrapers almost automatically using ChatGPT/GPT-3 by providing it with the abstract scraper class, an example implementation for one popular recipe site, and the HTML of the page it should scrape. That way we could contribute sites really fast and it would scale.

If one scraper works for multiple sites (e.g. because they use the same CMS, like WordPress), we would just have to extend its array of URL regexes.
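
A rough sketch of what such a scraper registry could look like, in TypeScript. Everything here is hypothetical and only illustrates the idea: the `Recipe` shape, the class names, the `example.com` pattern and the markup the example scraper expects are all made up, and `GenericScraper` merely stands in for the existing generic algorithm.

```typescript
// Hypothetical shape of a scraped recipe; the field names are assumptions.
interface Recipe {
  title: string;
  ingredients: string[];
}

// Abstract base: each concrete scraper declares the URL patterns it handles.
abstract class RecipeScraper {
  abstract readonly urlPatterns: RegExp[];
  abstract scrape(html: string): Recipe | null;

  matches(url: string): boolean {
    return this.urlPatterns.some((pattern) => pattern.test(url));
  }
}

// Example site-specific scraper (site and markup are invented for illustration).
class ExampleSiteScraper extends RecipeScraper {
  readonly urlPatterns = [/^https?:\/\/(www\.)?example\.com\/recipes\//];

  scrape(html: string): Recipe | null {
    const title = /<h1[^>]*>([^<]+)<\/h1>/.exec(html)?.[1]?.trim();
    if (!title) return null;
    const ingredients: string[] = [];
    const itemRe = /<li class="ingredient">([^<]+)<\/li>/g;
    let m: RegExpExecArray | null;
    while ((m = itemRe.exec(html)) !== null) {
      ingredients.push(m[1].trim());
    }
    return { title, ingredients };
  }
}

// Fallback that plays the role of the current generic algorithm.
class GenericScraper extends RecipeScraper {
  readonly urlPatterns = [/.*/];

  scrape(html: string): Recipe | null {
    const title = /<title>([^<]+)<\/title>/.exec(html)?.[1]?.trim();
    return title ? { title, ingredients: [] } : null;
  }
}

// Order matters: site-specific scrapers first, generic fallback last.
const fallback = new GenericScraper();
const scrapers: RecipeScraper[] = [new ExampleSiteScraper(), fallback];

function pickScraper(url: string): RecipeScraper {
  for (const scraper of scrapers) {
    if (scraper.matches(url)) return scraper;
  }
  return fallback;
}
```

Supporting another site that shares the same markup would then only mean adding one more regex to `urlPatterns`, and a new site with different markup would be one more small class in the `scrapers` array.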

I think this would greatly reduce the complexity of your code and would truly allow you to develop a one-fits-all solution.

The scraping part could even be a separate open source project, to be honest, as I can imagine that not only Logseq users would be interested in something like recipe-dl ^^ Basically youtube-dl but for recipes ;)

lukas-mertens commented 1 year ago

Actually, I did some further research. This seems like a great blog post:

https://www.benawad.com/scraping-recipe-websites/

Maybe that helps. The part about ld+json in particular is something you could definitely implement.
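
To make the ld+json idea concrete: many recipe sites embed a schema.org `Recipe` object in a `<script type="application/ld+json">` tag, so the structured data can be read without guessing at the page layout. Below is a minimal sketch, assuming the page's HTML is already available as a string. The function names are hypothetical, and the regex-based tag matching is a simplification; a real implementation would probably use a DOM parser.

```typescript
// Subset of schema.org Recipe properties used in this sketch.
interface JsonLdRecipe {
  name?: string;
  recipeIngredient?: string[];
  recipeInstructions?: unknown;
}

function isRecipeNode(node: unknown): node is JsonLdRecipe {
  if (typeof node !== "object" || node === null) return false;
  const type = (node as Record<string, unknown>)["@type"];
  // "@type" may be a single string or an array of strings.
  return Array.isArray(type) ? type.indexOf("Recipe") !== -1 : type === "Recipe";
}

// Depth-first search through top-level arrays and "@graph" containers.
function findRecipe(data: unknown): JsonLdRecipe | null {
  if (Array.isArray(data)) {
    for (const item of data) {
      const found = findRecipe(item);
      if (found) return found;
    }
    return null;
  }
  if (isRecipeNode(data)) return data;
  if (typeof data === "object" && data !== null) {
    return findRecipe((data as Record<string, unknown>)["@graph"] ?? null);
  }
  return null;
}

// Scan every ld+json script block and return the first Recipe found.
function extractRecipeFromHtml(html: string): JsonLdRecipe | null {
  const scriptRe =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptRe.exec(html)) !== null) {
    try {
      const found = findRecipe(JSON.parse(match[1]));
      if (found) return found;
    } catch {
      // Malformed JSON in one block should not stop the search.
    }
  }
  return null;
}
```

Sites that do not publish JSON-LD would still need a fallback, so this would complement the existing parsing rather than replace it.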

pabloscloud commented 6 months ago

https://www.gaumenfreundin.de/rote-linsen-lasagne/

pabloscloud commented 6 months ago

https://biancazapatka.com/en/vegan-lasagna-with-lentils-and-zucchini/#recipe

shotcollin commented 4 months ago

None of these worked for me. I have plugin v1.0.2. https://www.allrecipes.com/recipe/221321/sazerac-cocktail/ https://www.liquor.com/recipes/sazerac/ https://www.thespruceeats.com/sazerac-cocktail-recipe-760604