Closed THuffam closed 4 years ago
Hm, strange... could you post or provide the csv file you use?
yep - it's just a simple one line (class) to test:
searchterm,exclude
surfer aerial view,aerial view
Same file worked on my windows pc. Note I created the file separately on each machine (I did not copy it from one to the other, so no weird character differences).
I've just checked and for the last run (the last error above) it did seem to work, in that it created the output folder and appears to have created the resized images - but it looks like some images are missing and it also did not create the .raw output folder.
I'm not sure, but I have the suspicion that the underlying icrawler has issues with the current state of the google API. There is a closed issue https://github.com/hellock/icrawler/pull/68, but no new version 0.6.3 yet as far as I can see...
I guess I really should get going and finally implement the test suite I planned to add... 🤷♂️
I'll try to trace what's going on in the next few days... Sorry for that
@THuffam ,
would you mind testing only non-Google crawl options for the moment?
fcd -c BING -k -o surfers surfers.csv
This worked for me just now... I'll try to figure out what's going on with GOOGLE (which is also included in ALL) in the meantime...
Hey thanks so much for your help with this. I have just kicked off the command you asked for. I'm quite happy to just use the output from the windows machine - but thought I should raise the issue. That said I'm very impressed with your software and can see me using it more, so it would be great to get it working on ubuntu. I wonder if it's something to do with my environment - it's a newly installed OS (just created a dual boot 2 days ago) - so happy to make any change if you need me to. I'm based in Australia - so our time differences will be a bit out of whack. I'll post the results tomorrow morning. Thanks again Cheers Tim
I am having the same issue when using Google. Bing is fine.
icrawler.builtin.GoogleImageCrawler is not returning any results. Google must have changed the format of its image search.
aha..
The issue of IsADirectoryError: [Errno 21] Is a directory: '/tmp/tmp2fv9jud7/surfer is caused by an empty line in the query file - when run on linux (Ubuntu v19.10) - whereas windows and bash on windows can handle it. I noticed that your example file (guitars.csv) also has a blank line.
I noticed in the output that it looks like it is trying to do 2 searches even though there is just one query in the file... see Searching: >> << in the output above.
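For illustration, here is a minimal sketch of how a query file with a searchterm,exclude header could be read so that blank or whitespace-only lines never become an empty search. The helper name read_queries is hypothetical and this is not fastclass's actual parsing code; it only shows the idea of dropping empty search terms before any folder is created for them.

```python
import csv
import io


def read_queries(csv_text):
    """Parse query-file text with a 'searchterm,exclude' header.

    Hypothetical helper for illustration only; not fastclass's real parser.
    Returns a list of (searchterm, exclude) tuples, skipping blank rows.
    """
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    queries = []
    for row in reader:
        # Missing columns come back as None, so fall back to "".
        term = (row.get("searchterm") or "").strip()
        if not term:
            # An empty term would otherwise trigger a second, empty
            # search like the "Searching: >> <<" seen in the output.
            continue
        exclude = (row.get("exclude") or "").strip()
        queries.append((term, exclude))
    return queries


# One real query followed by an empty line and a whitespace-only line.
sample = "searchterm,exclude\nsurfer aerial view, aerial view\n\n  \n"
print(read_queries(sample))
```

With a filter like this, a trailing blank line in the csv would simply be ignored on every platform instead of being treated as a second query.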
Also, while testing all of this, I rebooted and created 2 new conda environments, one with python 3.7.6 and the other 3.6.10 - both worked using -c BING. So the original error did not occur either.
Hope this helps Let me know if there is anything else I can do to help. Cheers Tim
Hi! I'm having the same issue of 'NoneType' object is not iterable when searching in Google, but it works fine on BAIDU or BING.
I'm using python 3.8.2 on a Windows 10 machine and I tested with my own csv and the guitars.csv.
Hi 👋
Yes, that’s unfortunately currently a bug with the underlying icrawler package and the changes in Google's API. There is an unreleased patch discussed in icrawler that I am investigating at the moment.
I’m a bit swamped with real-life work atm but will try to look into how we could fix this...
Hi all :wave:
I just added a GoogleCrawler hotfix. Could anyone that has issues with fcd using the GOOGLE option try again and see if the problem is resolved?
Cheers
Sure .. just tried to update it - but not sure of the command to use - have tried pip install git+https://github.com/cwerner/fastclass.git#egg=fastclass
But when I ran fcd it gave the same error (TypeError: 'NoneType' object is not iterable)
I'm assuming it has not updated to use your new code. What command should I use to update instead? Thanks tim
Hi @thuffam 👋
Could you try to add --upgrade to the pip command?
Works great - thanks for the update! One difference though... I couldn't find any of the original images (which I wanted to keep) - were these deleted or are they now located somewhere else?
Thanks again
@THuffam, great to hear!
Hm, I swapped the path handling from os.path to pathlib... However, this shouldn't alter the locations... Can you explain a bit more what you are doing exactly? and maybe start in a clean folder?
Where does pathlib point to?
I use the following command:
fcd -c GOOGLE -s 224 surfers.csv
The file surfers.csv contains these 2 lines:
searchterm,exclude
surfer aerial view, aerial view
I have just created another folder to retest this, copied my .csv file into it and re-ran the command. It produced the same results... it created a folder called dataset and within this a file called surfer.log and a folder called surfer which contains all the resized images. But I cannot see any of the original images anywhere (not to say they are not somewhere else on my drive).
@THuffam Thanks for checking 👍.
Seems I did introduce a bug in the process. Really should add some proper tests... I'm at work atm and need to finish some stuff but will have a look at it tonight at the latest. Should be an easy fix...
Ahm @thuffam, did you by any chance forget to add the --keep / -k flag? 😉
See fcd --help?
fcd -k -c GOOGLE -s 224 surfers.csv
This should create a dataset.raw folder with the original images...
Oh - yes - fail. Sorry about that. Tried again with the following command:
fcd -k -c GOOGLE -s 224 surfers.csv
And this time it did keep the original files (in the dataset.raw folder).
Also tested it with the -c ALL option and it worked - although that only used Google and Bing - not Baidu which I've seen in other examples.
Thanks
Ok. Great. Closing this now 👍
Hi, I just installed FastClass as per the instructions in your blog post.
I created a simple query file and ran the command as per your blog post but got the following error:
I'm running Ubuntu 19.10 using conda environment with Python 3.6. I also tried a new install of FastClass in a new conda environment with python 3.7 and got the following error:
However it did run fine when run from a conda environment with python 3.7 on my windows 10 box.
Any suggestions? Thanks, and thanks so much for developing this app! Kind regards Tim