Open next55777 opened 9 years ago
Flickr have changed their site to make it as difficult as possible to parse pages and do simple stuff like get a list of all the photos in a set. Everything is done via dynamic JavaShit calls. Check out their HTML, it's a disgusting mess.
The funny thing is they've included an HTML comment at the top of each page trying to solicit job applications, as if anyone would want to go and work for them after seeing their monstrosity.
"Flickr have changed their site"
This is what I was afraid of...
How about the flickr API?
Hello EraYaN,
API call method: flickr.photos.getSizes
Source: https://www.flickr.com/services/api/flickr.photos.getSizes.html
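For anyone wanting to try that call by hand, here is a minimal sketch in Python (not ripme's code) of how the request URL for flickr.photos.getSizes could be put together against Flickr's REST endpoint. The function name, the placeholder API key and the photo ID are made up for illustration.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; the API method is selected via the "method" parameter.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_get_sizes_url(api_key: str, photo_id: str) -> str:
    """Return the request URL for flickr.photos.getSizes for a single photo.

    build_get_sizes_url is a hypothetical helper, not part of ripme.
    """
    params = {
        "method": "flickr.photos.getSizes",
        "api_key": api_key,
        "photo_id": photo_id,
        "format": "json",       # ask for JSON instead of the default XML
        "nojsoncallback": "1",  # plain JSON, no JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values; substitute a real API key and photo ID.
    print(build_get_sizes_url("YOUR_API_KEY", "1234567890"))
```

This only builds the URL; fetching it and handling errors is left out on purpose.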
I did this, but it's messy. It's based on https://github.com/4pr0n/ripme/issues/286 and adds a dependency for OAuth: https://github.com/bobobo1618/ripme/commit/ce9b51d1b1f71dfb2e2a02074a52b5be89f924b3
Supports galleries, profiles, albums and groups though.
So it is only working while authenticated, i.e. signed in with a Flickr account?
I think that is no big deal at all, legitimate requirement, IMHO.
Afaik, the options are:
But no, you don't need to be signed in if you're only downloading public data. You do still need API credentials though.
Okay, good to know.
But I think that is the right way to do it, use the API, provide key if necessary.
BTW, do you have by any chance a build of your fork somewhere @bobobo1618?
I can help to test it, if you want.
Sure. Here.
You'll need to set up the variables required by your ripper though.
Thanks!
You mean the variables set in rip.properties in %HOMEPATH%?
The file looks like this normally (relevant part only):
# API creds
twitter.auth = VW9Ybjdjb1pkd2J0U3kwTUh2VXVnOm9GTzVQVzNqM29LQU1xVGhnS3pFZzhKbGVqbXU0c2lHQ3JrUFNNZm8=
tumblr.auth = v5kUqGQXUtmF7K0itri1DGtgTs0VQpbSEbh1jxYgj9d2Sq18F8
gw.api = gonewild
So I'd assume the syntax would be flickr.auth = <key>, right? :wink:
Okay, scratch that. rip.properties and history.json are in your current working dir, not %HOMEPATH%.
Upon first launch, rip.properties was overwritten. Not a problem, no worries.
I guess this entry is also new:
storage.module = jets3t
At least it's not in the standard repo. "jets3t" is S3, according to Google.
What are the names of the other options here, @bobobo1618?
It's in the code. You want "flickr.accessTokenSecret" and "flickr.accessTokenToken". You should set the storage module to "file". Jets3t shouldn't be set.
Well, I definitely did not set it to 'jets3t' manually..
Switched it to file, thanks!
"It's in the code. You want "flickr.accessTokenSecret" and "flickr.accessTokenToken""
It doesn't work if set in rip.properties? Or did I misunderstand you?
I just gave you the jar I had sitting in my debug build folder so it might have some temporary debug code I was using or something.
Sorry, I was just pointing out that the names of the variables you want are in the commit. You should set them in rip.properties.
Ah, got it. Thanks for clearing that up!
Sorry @bobobo1618 for the delay, but I was kinda busy lately..
I set up flickr.accessTokenToken and flickr.accessTokenSecret and tried some Flickr URLs, like:
https://www.flickr.com/photos/julianbialowas/
https://www.flickr.com/photos/julianbialowas/albums/72157623190416483
but I always get this error:
Can't rip this URL: No compatible ripper found
Other rippers still work, I tried some tumblr sites just to be sure.
I guess the domain/host doesn't get recognized properly.
I skimmed the FlickrAPIRipper.java looking for a clue, but I couldn't find anything so far.
Well, it does read flickr.signed from the config, but I tried that too and it did not help..
It has a canRip method inherited, though..
I did a search, found this: RipperInterface.java, AbstractJSONRipper.java, AbstractHTMLRipper.java, AbstractRipper.java
Don't know if this is really useful. This is the parent repo, because, apparently: "Sorry, forked repositories are not currently searchable."
Ah, sorry. I messed up reading my own code. The variables you want are apiKey and apiSecret. accessTokenToken and accessTokenSecret are generated by a complete OAuth flow.
You should leave signed as false (not in the config at all).
You should have seen an error like this in the console output as well: No Flickr API key or secret specified. I have no idea what happens if you're doing something crazy like using the GUI though.
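To make the config requirements concrete, here is a rough Python sketch (not the ripper's actual Java code) that reads a rip.properties-style text and reports whether flickr.apiKey and flickr.apiSecret are present. The parser is deliberately simplified and does not cover the full Java properties syntax; the function names and the sample values are made up.

```python
def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blank lines and '#' comments.

    Simplified on purpose: no ':' separators, escapes or line continuations.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_flickr_config(props: dict) -> list:
    """Return a list of problems; an empty list means the basics are there."""
    problems = []
    # Only the API key and secret have to be supplied by hand; the access
    # token values come out of a complete OAuth flow.
    for key in ("flickr.apiKey", "flickr.apiSecret"):
        if not props.get(key):
            problems.append("missing " + key)
    return problems


if __name__ == "__main__":
    sample = """
    # API creds (placeholder values)
    flickr.apiKey = abc123
    """
    print(check_flickr_config(parse_properties(sample)))
```

With the sample above, only flickr.apiSecret is reported as missing.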
Okay, fixed the config.
But I still get this:
D:\Inst\Ripme\ripme-bo>C:\RT\JRE\bin\java.exe -jar ripme.jar -u https://www.flickr.com/photos/julianbialowas/
Loaded D:\Inst\Ripme\ripme-bo\rip.properties
Flickr got a Flickr URL but couldn't parse it.
[!] Error while ripping URL https://www.flickr.com/photos/julianbialowas/
java.lang.Exception: No compatible ripper found
at com.rarchives.ripme.ripper.AbstractRipper.getRipper(AbstractRipper.java:305)
at com.rarchives.ripme.App.rip(App.java:53)
at com.rarchives.ripme.App.handleArguments(App.java:145)
at com.rarchives.ripme.App.main(App.java:45)
D:\Inst\Ripme\ripme-bo>
If I just run java.exe -jar ripme.jar, the GUI still appears. Strange Java stuff...
Hmm, turns out it doesn't work with unsigned requests yet and the library I was using to handle it doesn't support them. I'll have to look into it some more. Should've tested that.
Okay, I lied. Apparently the library does support it. Here's a new test build.
Good job, making progress here!
Tried it with https://www.flickr.com/photos/evanatwood/, and got the images.
Kudos to you..
I didn't look into the different file names and sizes thing yet. Apparently you can't view different image sizes on that Flickr profile without being logged in, sigh..
I had a look and it doesn't seem to matter at all whether I'm logged in, the maximum size returned by the API is L (1024x) either way. Are you "friends" or something on Flickr? Sometimes that gives you access to larger sizes or extra photos or something and for that of course you'll have to be logged in.
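As an illustration of the "maximum size returned by the API" point, here is a small Python sketch that picks the largest entry from a flickr.photos.getSizes JSON response. It assumes the response has already been decoded into a dict with the sizes under "sizes" and "size"; the sample data and URLs are invented, and this is not the ripper's own code.

```python
def largest_size(get_sizes_response: dict) -> dict:
    """Return the size entry with the most pixels from a getSizes response.

    Width and height are passed through int() so that both numeric and
    string values work.
    """
    sizes = get_sizes_response["sizes"]["size"]
    return max(sizes, key=lambda s: int(s["width"]) * int(s["height"]))


if __name__ == "__main__":
    # Invented sample data, shaped like a decoded getSizes response.
    sample = {
        "sizes": {
            "size": [
                {"label": "Square", "width": 75, "height": 75,
                 "source": "https://example.invalid/photo_s.jpg"},
                {"label": "Medium", "width": "500", "height": "333",
                 "source": "https://example.invalid/photo.jpg"},
                {"label": "Large", "width": "1024", "height": "683",
                 "source": "https://example.invalid/photo_b.jpg"},
            ]
        }
    }
    best = largest_size(sample)
    print(best["label"], best["source"])
```

If a larger size only shows up for authenticated requests, the same function would pick it once it appears in the list.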
The login process is:
- Set flickr.apiKey, flickr.apiSecret
- Set flickr.signed to true
- ... verifier or oauth_verifier in the URL, copy that and paste it to flickr.verifier in the config file
- ... Authenticated Flickr.
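If the verifier does end up in the URL, pulling it out can be sketched like this in Python. This is only an illustration: the callback URL and the function name are made up, and it simply looks for an oauth_verifier or verifier query parameter.

```python
from urllib.parse import urlparse, parse_qs


def extract_verifier(callback_url: str):
    """Return the OAuth verifier from a URL's query string, or None if absent.

    Checks oauth_verifier first, then falls back to verifier.
    """
    query = parse_qs(urlparse(callback_url).query)
    for name in ("oauth_verifier", "verifier"):
        values = query.get(name)
        if values:
            return values[0]
    return None


if __name__ == "__main__":
    # Hypothetical callback URL, for illustration only.
    url = "https://example.invalid/callback?oauth_token=abc&oauth_verifier=xyz123"
    print("flickr.verifier =", extract_verifier(url))
```

The printed line is in the form that would go into the config file.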
No, we're not 'friends' on Flickr, I just had not seen this before, but this is not really surprising considering the fine-grained access control you can set up in your Flickr account.
I followed the steps, and got that verifier. But it was not part of the URL or something, just white-on-black text in the center of the page?
But I assume it was the right verifier:
D:\Inst\Ripme\ripme-bo>C:\RT\JRE\bin\java.exe -jar ripme.jar -u "https://www.flickr.com/photos/evanatwood/"
Loaded D:\Inst\Ripme\ripme-bo\rip.properties
Loaded log4j.properties
Initialized ripme v1.2.8
Got verifier!
Saved configuration to D:\Inst\Ripme\ripme-bo\rip.properties
Authenticated Flickr.
Determining URL type.
Found type with method flickr.urls.lookupUser
Retrieving https://www.flickr.com/photos/evanatwood/
Getting page 1
[and so on]
Thanks, kudos to you!
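For reference, the "Determining URL type" step in the log can be pictured roughly like the Python sketch below. It is a guess at the general idea, not the ripper's real logic: the regular expressions and type names are assumptions, covering only profile, album, gallery and group URLs.

```python
import re

# Assumed URL shapes; more specific patterns come first so that an album
# URL is not mistaken for a profile URL.
FLICKR = r"^https?://(?:www\.)?flickr\.com"
URL_TYPES = [
    (re.compile(FLICKR + r"/photos/([^/]+)/(?:albums|sets)/(\d+)/?$"), "album"),
    (re.compile(FLICKR + r"/photos/([^/]+)/galleries/(\d+)/?$"), "gallery"),
    (re.compile(FLICKR + r"/groups/([^/]+)/?$"), "group"),
    (re.compile(FLICKR + r"/photos/([^/]+)/?$"), "profile"),
]


def classify(url: str):
    """Return (type, captured parts) for a Flickr URL, or None if unrecognised."""
    for pattern, kind in URL_TYPES:
        match = pattern.match(url)
        if match:
            return kind, match.groups()
    return None


if __name__ == "__main__":
    for url in (
        "https://www.flickr.com/photos/julianbialowas/",
        "https://www.flickr.com/photos/julianbialowas/albums/72157623190416483",
        "https://www.flickr.com/explore",
    ):
        print(url, "->", classify(url))
```

A URL that matches none of the patterns returns None, which corresponds to the "couldn't parse it" case seen earlier in the thread.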
Seems like ripme is no longer able to download anything from Flickr. To reproduce, use the first Flickr photostream that comes to mind. Each time, ripme returns a "no images found at" error. I use the latest version of ripme and JavaPortableLauncher 3.0, which has worked perfectly for me so far. The last time I used them was about 2 months ago and everything was OK.