Closed: Tisila closed this 5 years ago
I just realized it's only working for the Portuguese version, making changes...
Hey @Tisila, is it only working for Portuguese because you hardcoded the .pt ending for the website, or are there other reasons as well?
Hey there @Simsal, it's only working for Portuguese because I hardcoded the .pt in the initial link and in the recipeToFile function. I already started this change and it's almost complete.
@Simsal All the hardcodings around the locale have been resolved. Test it out and let me know if you have any issues. Happy scraping!
If both of you wish, I can add you as project developers.
@auino, you already added me as a collaborator; that's the same thing, right?
Yes, I only noticed it after writing the message (sorry).
I've tried the new version, but it doesn't work on macOS, even after renaming the binary file referenced in the script to chromedriver.
No problem ;-)
Have you downloaded the correct chromedriver, and do you have the Chrome browser installed?
Although I don't know what kind of error message you're getting, I did a quick search and found that the webdriver may have the wrong permissions (Stack Overflow link).
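If permissions do turn out to be the cause, a minimal sketch of a check could look like the following. The helper name ensure_executable is hypothetical and not part of the script; it only adds the execute bits when the current user cannot already run the file.

```python
import os
import stat


def ensure_executable(path):
    """Hypothetical helper: add execute bits to `path` if it is not executable yet."""
    if not os.access(path, os.X_OK):
        mode = os.stat(path).st_mode
        # Equivalent to `chmod +x path` on Unix-like systems.
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return os.access(path, os.X_OK)
```

For example, ensure_executable("./chromedriver") could be called once before the driver is started.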
I'm also getting the following error, also on macOS :)
This can be easily solved by installing selenium:
sudo pip install selenium
It works if the ./chromedriver path is used (I've added it as an optional parameter, in a temporary script). Nevertheless, the cookidoo.it domain input is not working (it returns a NameError: name 'it' is not defined error).
I was not expecting that one...
The "pt" or "it" locale is just appended to the baseURL.
I can't see what I did wrong.
I'll check it later on today.
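For reference, a minimal sketch of what appending the locale to the base URL amounts to. The function name build_base_url and the https scheme are assumptions made for illustration, not the actual code in the script.

```python
def build_base_url(locale):
    """Hypothetical helper: build the site URL from a locale such as 'pt' or 'it'."""
    locale = locale.strip().lower()
    if not locale.isalpha():
        # Reject input such as "'it'" or "it/" before it ends up in a URL.
        raise ValueError("unexpected locale: %r" % locale)
    return "https://cookidoo." + locale
```

Nothing in this step can raise a NameError by itself, so the error most likely comes from how the value is read from the terminal.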
It works if you enter it with single quotes -> 'de'
After logging in and pressing Enter, the script terminates with the following error
I can see that there are some inconsistencies between Windows and Mac. I think I know how to solve it. Hold on!
I tried two different terminals and it worked fine. I made some changes that might solve the issue, but that's not guaranteed. I don't have a Mac, but a more universal solution would be to use a Docker container; just a thought. I would happily create a Dockerfile to get this up and running.
I've tried it and it works great. I believe that, to improve it, the single-quoted input should be removed. Dockerization would also be good. I'll now accept your pull request, then make minor changes for multi-platform support (by adding input parameters to the script).
Found a way to remove the single quotes, with raw_input() (see https://stackoverflow.com/questions/37404134/in-python-is-there-anyway-to-input-a-string-without-quotation-marks).
I understand what this does and agree with the solution. I just have one silly question: are you running this with Python v2 or v3?
Python 2.7.10
There it is! Although I specified it at the start of the file and in the first commit, I forgot to mention it here... cookiscrap.py is Python 3.
I'm sorry for the inconvenience.
Well, the current version should work on v2 too.
OK, let's make the raw_input change and keep v2 support, since it's a simple solution.
We could also dynamically detect the Python version (see this post on StackOverflow) and use the input/raw_input function accordingly.
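A minimal sketch of that version check, assuming the script only needs to read a short string such as the locale. The names read_line and read_locale, and the prompt text, are hypothetical.

```python
import sys

# On Python 2, input() evaluates the typed text as an expression, so typing
# it (without quotes) raises NameError, while raw_input() returns the text as-is.
# On Python 3, raw_input() no longer exists and input() returns the text as-is.
if sys.version_info[0] < 3:
    read_line = raw_input  # noqa: F821 (only defined on Python 2)
else:
    read_line = input


def read_locale(prompt="Locale (e.g. pt, it, de): ", reader=None):
    """Hypothetical helper: return the typed locale without whitespace or quotes."""
    reader = reader or read_line
    return reader(prompt).strip().strip("'\"").lower()
```

Stripping the quotes keeps the old habit of typing 'de' working as well.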
You're right, that's the right way to get this working correctly. I thought that raw_input worked in v3.
Started making those changes in the new-parser branch.
The input is working great!
You can delete new-scraping if you wish. New developments will be made in new-parser.
So far I've managed to implement working recipe ID scraping for HTML downloads. It is aware of the files already in the working directory and only downloads the ones that are missing.
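The "only download what is missing" part could be sketched roughly like this. The function name missing_recipe_ids and the file naming scheme (one file per recipe, named after its ID with an .html extension) are assumptions for illustration; the actual branch may store files differently.

```python
import os


def missing_recipe_ids(recipe_ids, directory="."):
    """Hypothetical helper: return the IDs with no '<id>.html' file in `directory` yet."""
    existing = set(os.listdir(directory))
    # Keep the original order so downloads resume where they left off.
    return [rid for rid in recipe_ids if rid + ".html" not in existing]
```

Only the IDs returned by this function would then be passed to the download step.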
The next step will be to create a Markdown file with an index based on Cookidoo categories, to search and open the recipes. It would also be interesting to do the same based on user bookmarks.
Finally, I will create an HTML-to-Markdown parser and convert the Markdown to a PDF file.