no longer working with flickr

next55777 commented 9 years ago

Seems like ripme is no longer able to download anything from Flickr. To reproduce, use the first Flickr photostream that comes to mind. Each time, ripme returns a "no images found at" error. I use the latest version of ripme with JavaPortableLauncher 3.0, which has worked perfectly for me so far. I last used them about 2 months ago and everything was OK.

drguildo commented 9 years ago

Flickr have changed their site to make it as difficult as possible to parse pages and do simple stuff like get a list of all the photos in a set. Everything is done via dynamic JavaShit calls. Check out their HTML, it's a disgusting mess.

The funny thing is they've included an HTML comment at the top of each page trying to solicit job applications, as if anyone would want to go and work for them after seeing their monstrosity.

next55777 commented 9 years ago

> Flickr have changed their site

This is what I was afraid of...

EraYaN commented 9 years ago

How about the flickr API?

version365 commented 9 years ago

> How about the flickr API?

Try this app: http://downloadair.ghusse.com/

It uses the Flickr API.

ingochris commented 9 years ago

Hello EraYaN,

API call method:

flickr.photos.getSizes

Source: https://www.flickr.com/services/api/flickr.photos.getSizes.html
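For anyone following along, here is a minimal sketch of what a request URL for that method could look like. It assumes Flickr's public REST endpoint and the documented `api_key`, `photo_id`, `format` and `nojsoncallback` parameters; the class and method names are hypothetical and not part of ripme, and the key and photo ID are placeholders. It needs Java 10 or later for `URLEncoder.encode(String, Charset)`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not taken from ripme.
class FlickrSizesUrl {
    // Flickr's REST endpoint.
    static final String ENDPOINT = "https://api.flickr.com/services/rest/";

    // Builds an unsigned flickr.photos.getSizes request URL that asks for plain JSON.
    static String build(String apiKey, String photoId) {
        return ENDPOINT
                + "?method=flickr.photos.getSizes"
                + "&api_key=" + encode(apiKey)
                + "&photo_id=" + encode(photoId)
                + "&format=json&nojsoncallback=1";
    }

    // URL-encodes a single query parameter value.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; substitute a real API key and photo ID.
        System.out.println(build("YOUR_API_KEY", "1234567890"));
    }
}
```

Fetching that URL should return the list of available sizes for one photo, so a ripper would need one such call per photo after listing the photos in a set or photostream.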

bobobo1618 commented 8 years ago

I did this but it's messy, based on https://github.com/4pr0n/ripme/issues/286 and adds a dependency for OAuth: https://github.com/bobobo1618/ripme/commit/ce9b51d1b1f71dfb2e2a02074a52b5be89f924b3.

Supports galleries, profiles, albums and groups though.

Hrxn commented 8 years ago

So it is only working while authenticated, i.e. signed in with a Flickr account?

I think that is no big deal at all, legitimate requirement, IMHO.

bobobo1618 commented 8 years ago

Afaik, the options are:

1. Use the API with a valid set of credentials
   - And a user-authenticated OAuth token (this gets you things like images only shared with you)
   - Just with the API keys (this gets you access to public data)
2. Scrape the page, which as discussed above seems to be difficult.

But no, you don't need to be signed in if you're only downloading public data. You do still need API credentials though.

Hrxn commented 8 years ago

Okay, good to know.

But I think that is the right way to do it, use the API, provide key if necessary.

Hrxn commented 8 years ago

BTW, do you have by any chance a build of your fork somewhere @bobobo1618?

I can help to test it, if you want.

bobobo1618 commented 8 years ago

Sure. Here.

You'll need to set up the variables required by your ripper though.

Hrxn commented 8 years ago

Thanks!

You mean the variables set in rip.properties in %HOMEPATH% ?

The file looks like this normally (relevant part only):

# API creds
twitter.auth = VW9Ybjdjb1pkd2J0U3kwTUh2VXVnOm9GTzVQVzNqM29LQU1xVGhnS3pFZzhKbGVqbXU0c2lHQ3JrUFNNZm8=
tumblr.auth = v5kUqGQXUtmF7K0itri1DGtgTs0VQpbSEbh1jxYgj9d2Sq18F8
gw.api = gonewild

So I'd assume the syntax would be flickr.auth = <key> , right? :wink:

Hrxn commented 8 years ago

Okay, scratch that. rip.properties and history.json are in your current working dir, not %HOMEPATH%

Upon first launch, rip.properties was overwritten. Not a problem, no worries.

I guess this entry is also new:

storage.module = jets3t

At least it's not in the standard repo. "jets3t" is S3, according to Google.

What are the names of the other options here, @bobobo1618?

bobobo1618 commented 8 years ago

It's in the code. You want "flickr.accessTokenSecret" and "flickr.accessTokenToken". You should set the storage module to "file". Jets3t shouldn't be set.

Hrxn commented 8 years ago

Well, I definitely did not set it to 'jets3t' manually.. Switched it to file, thanks!

> It's in the code. You want "flickr.accessTokenSecret" and "flickr.accessTokenToken"

It doesn't work if set in rip.properties? Or did I misunderstand you?

bobobo1618 commented 8 years ago

I just gave you the jar I had sitting in my debug build folder so it might have some temporary debug code I was using or something.

Sorry, I was just pointing out that the names of the variables you want are in the commit. You should set them in rip.properties.

Hrxn commented 8 years ago

Ah, got it. Thanks for clearing that up!

Hrxn commented 8 years ago

Sorry @bobobo1618 for the delay, but I was kinda busy lately..

I set up

flickr.accessTokenToken
flickr.accessTokenSecret

and tried some flickr URLs, like:

https://www.flickr.com/photos/julianbialowas/
https://www.flickr.com/photos/julianbialowas/albums/72157623190416483

but I always get this error: Can't rip this URL: No compatible ripper found

Other rippers still work, I tried some tumblr sites just to be sure.

I guess the domain/host doesn't get recognized properly

I skimmed the FlickrAPIRipper.java looking for a clue, but I couldn't find anything so far.

Well, it does read flickr.signed from the config, but I tried that too and it did not help..

It has a canRip method inherited, though..

I did a search, found this: RipperInterface.java AbstractJSONRipper.java AbstractHTMLRipper.java AbstractRipper.java

Don't know if this is really useful. This is the parent repo, because, apparently:

> Sorry, forked repositories are not currently searchable.
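To illustrate the kind of check a `canRip`-style method might perform on the two URLs above, here is a rough sketch. It is not the code from FlickrAPIRipper.java or the parent repo; the class name, method names and regular expressions are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical URL matcher; the patterns are guesses, not ripme's actual ones.
class FlickrUrlMatcher {
    // e.g. https://www.flickr.com/photos/<user>/albums/<id>
    private static final Pattern ALBUM = Pattern.compile(
            "^https?://(?:www\\.)?flickr\\.com/photos/([^/]+)/(?:albums|sets)/(\\d+)/?$");
    // e.g. https://www.flickr.com/photos/<user>/
    private static final Pattern PROFILE = Pattern.compile(
            "^https?://(?:www\\.)?flickr\\.com/photos/([^/]+)/?$");

    // True if the URL looks like a Flickr profile or album page.
    static boolean canRip(String url) {
        return ALBUM.matcher(url).matches() || PROFILE.matcher(url).matches();
    }

    // Returns a short description of what the URL points at.
    static String describe(String url) {
        Matcher album = ALBUM.matcher(url);
        if (album.matches()) {
            return "album " + album.group(2) + " of " + album.group(1);
        }
        Matcher profile = PROFILE.matcher(url);
        if (profile.matches()) {
            return "profile " + profile.group(1);
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        System.out.println(describe("https://www.flickr.com/photos/julianbialowas/"));
        System.out.println(describe(
                "https://www.flickr.com/photos/julianbialowas/albums/72157623190416483"));
    }
}
```

If a check like this passes but a later step fails (a missing API key, for example), the ripper could still be rejected and end up reported as "No compatible ripper found".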

bobobo1618 commented 8 years ago

Ah, sorry. I messed up reading my own code. The variables you want are apiKey and apiSecret. accessTokenToken and accessTokenSecret are generated by a complete OAuth flow.

You should leave signed as false (not in the config at all).

You should have seen an error like this in the console output as well: No Flickr API key or secret specified. I have no idea what happens if you're doing something crazy like using the GUI though.
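As a rough illustration of the check described above, this sketch reads `flickr.apiKey` and `flickr.apiSecret` from a properties source and fails with the same message when either is missing. The property names and the error text come from this thread; the class itself is hypothetical and is not how the fork actually loads its config.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical credentials holder; not taken from the fork.
class FlickrCredentials {
    final String apiKey;
    final String apiSecret;

    FlickrCredentials(String apiKey, String apiSecret) {
        this.apiKey = apiKey;
        this.apiSecret = apiSecret;
    }

    // Reads both values and rejects a config where either one is absent or empty.
    static FlickrCredentials fromProperties(Properties props) {
        String key = props.getProperty("flickr.apiKey", "").trim();
        String secret = props.getProperty("flickr.apiSecret", "").trim();
        if (key.isEmpty() || secret.isEmpty()) {
            throw new IllegalStateException("No Flickr API key or secret specified");
        }
        return new FlickrCredentials(key, secret);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the relevant part of rip.properties; the values are placeholders.
        Properties props = new Properties();
        props.load(new StringReader("flickr.apiKey = abc\nflickr.apiSecret = def\n"));
        FlickrCredentials creds = fromProperties(props);
        System.out.println("Loaded API key of length " + creds.apiKey.length());
    }
}
```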

Hrxn commented 8 years ago

Okay, fixed the config.

But I still get this:

D:\Inst\Ripme\ripme-bo>C:\RT\JRE\bin\java.exe -jar ripme.jar -u https://www.flickr.com/photos/julianbialowas/
Loaded D:\Inst\Ripme\ripme-bo\rip.properties
Flickr got a Flickr URL but couldn't parse it.
[!] Error while ripping URL https://www.flickr.com/photos/julianbialowas/
java.lang.Exception: No compatible ripper found
        at com.rarchives.ripme.ripper.AbstractRipper.getRipper(AbstractRipper.java:305)
        at com.rarchives.ripme.App.rip(App.java:53)
        at com.rarchives.ripme.App.handleArguments(App.java:145)
        at com.rarchives.ripme.App.main(App.java:45)

D:\Inst\Ripme\ripme-bo>

If I just run java.exe -jar ripme.jar, the GUI still appears. Strange Java stuff...

bobobo1618 commented 8 years ago

Hmm, turns out it doesn't work with unsigned requests yet and the library I was using to handle it doesn't support them. I'll have to look into it some more. Should've tested that.

bobobo1618 commented 8 years ago

Okay, I lied. Apparently the library does support it. Here's a new test build.

Hrxn commented 8 years ago

Good job, making progress here!

Tried it with https://www.flickr.com/photos/evanatwood/, and got the images.

Kudos to you..

I haven't looked into the different file names and sizes yet. Apparently you can't view different image sizes on that Flickr profile without being logged in, sigh..

bobobo1618 commented 8 years ago

I had a look and it doesn't seem to matter at all whether I'm logged in, the maximum size returned by the API is L (1024x) either way. Are you "friends" or something on Flickr? Sometimes that gives you access to larger sizes or extra photos or something and for that of course you'll have to be logged in.
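For reference, picking the largest size out of a getSizes-style result could look roughly like the sketch below. It assumes each size entry carries a label, width, height and source URL; the `Size` class, the sample dimensions and the example.invalid URLs are made up for illustration and are not the fork's code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of choosing the biggest available image size.
class LargestSize {
    static final class Size {
        final String label;
        final int width;
        final int height;
        final String source;

        Size(String label, int width, int height, String source) {
            this.label = label;
            this.width = width;
            this.height = height;
            this.source = source;
        }
    }

    // Returns the entry with the largest pixel area, or empty if there are no sizes.
    static Optional<Size> largest(List<Size> sizes) {
        return sizes.stream()
                .max(Comparator.comparingLong((Size s) -> (long) s.width * s.height));
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a parsed API response.
        List<Size> sizes = Arrays.asList(
                new Size("Square", 75, 75, "https://example.invalid/sq.jpg"),
                new Size("Large", 1024, 683, "https://example.invalid/l.jpg"),
                new Size("Medium", 500, 333, "https://example.invalid/m.jpg"));
        largest(sizes).ifPresent(s -> System.out.println(s.label + " -> " + s.source));
    }
}
```

Comparing by pixel area rather than by label avoids depending on the order in which the API happens to list the sizes.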

The login process is:

1. Add flickr.apiKey, flickr.apiSecret
2. Set flickr.signed to true
3. Run the ripper on a Flickr URL
4. A URL will be printed in the console. Go to that URL and authorize the app with your account
5. Once complete, you'll be redirected to a page. There should be a parameter called verifier or oauth_verifier in the URL; copy its value into flickr.verifier in the config file
6. Re-run the ripper with the new config entry

You should see a line saying Authenticated Flickr. API requests will now be made using your personal account details.
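The steps above amount to a small state machine driven by the config file. Here is a hedged sketch of that decision logic, using the property names mentioned in this thread (`flickr.apiKey`, `flickr.apiSecret`, `flickr.signed`, `flickr.verifier`, `flickr.accessTokenToken`, `flickr.accessTokenSecret`). The `Step` enum and the class are hypothetical, and the real fork may order or handle these checks differently.

```java
import java.util.Properties;

// Hypothetical sketch of which login step a given config corresponds to.
class FlickrAuthState {
    enum Step {
        MISSING_API_CREDENTIALS, // no API key or secret configured
        UNSIGNED_REQUESTS,       // public data only, no user login
        NEEDS_AUTHORIZATION,     // print the authorization URL for the user to visit
        EXCHANGE_VERIFIER,       // verifier present, swap it for an access token
        AUTHENTICATED            // access token and secret already stored
    }

    static Step nextStep(Properties props) {
        if (isBlank(props, "flickr.apiKey") || isBlank(props, "flickr.apiSecret")) {
            return Step.MISSING_API_CREDENTIALS;
        }
        boolean signed = Boolean.parseBoolean(props.getProperty("flickr.signed", "false").trim());
        if (!signed) {
            return Step.UNSIGNED_REQUESTS;
        }
        if (!isBlank(props, "flickr.accessTokenToken") && !isBlank(props, "flickr.accessTokenSecret")) {
            return Step.AUTHENTICATED;
        }
        if (!isBlank(props, "flickr.verifier")) {
            return Step.EXCHANGE_VERIFIER;
        }
        return Step.NEEDS_AUTHORIZATION;
    }

    // True if the property is missing or contains only whitespace.
    private static boolean isBlank(Properties props, String key) {
        return props.getProperty(key, "").trim().isEmpty();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("flickr.apiKey", "abc");
        props.setProperty("flickr.apiSecret", "def");
        props.setProperty("flickr.signed", "true");
        System.out.println(nextStep(props));
    }
}
```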

Hrxn commented 8 years ago

No, we're not 'friends' on Flickr, I just had not seen this before, but this is not really surprising considering the fine-grained access control you can set up in your Flickr account.

I followed the steps, and got that verifier. But it was not part of the URL or something, just white-on-black text in the center of the page?

But I assume it was the right verifier:

D:\Inst\Ripme\ripme-bo>C:\RT\JRE\bin\java.exe -jar ripme.jar -u "https://www.flickr.com/photos/evanatwood/"
Loaded D:\Inst\Ripme\ripme-bo\rip.properties
Loaded log4j.properties
Initialized ripme v1.2.8
Got verifier!
Saved configuration to D:\Inst\Ripme\ripme-bo\rip.properties
Authenticated Flickr.
Determining URL type.
Found type with method flickr.urls.lookupUser
Retrieving https://www.flickr.com/photos/evanatwood/
Getting page 1
[and so on]

Thanks, kudos to you!

4pr0n / ripme

no longer working with flickr #226