fake-name / wlnupdates

It's a WEBSITE! Whooooo!
http://wlnupdates.com

Public Facing API #3

Closed nlvw closed 6 years ago

nlvw commented 8 years ago

Is there any chance of you implementing a public-facing API for database calls? I'm working on an Android app to track and read web/light novels, and it would be great to have sources available the way manga readers do.

fake-name commented 8 years ago

I have a pretty simple API used internally for managing things like reading lists. It talks JSON. The relevant files are here:

The endpoint is POST requests to /api ( https://github.com/fake-name/wlnupdates/blob/master/app/apiview.py#L26-L49 ).

Most of the calls are here:

While it's not really documented (and I can fix that, just no one has asked), I have zero issue with people implementing things to interact with it. It's fairly robust and does a lot of input validation.

There is currently no API for getting series information, though the HTML structure was written with trivial parseability in mind (each info div is uniquely named, structure is fairly flat, etc...).

I don't see any reason I couldn't implement a more thorough API, though to be honest, I'm not sure what something like that should look like, as I haven't put that kind of thing together before.

Ideas? I could pretty easily just provide a way to query by series IDs/name, and return a JSON structure of the page info.


Watching series is pretty straightforward. Basically, there's just one endpoint for list management and creating watches. Lists are created if they don't exist, and the only real caveat is that to remove an item from a list, you set its list to the special case value '-0-0-0-0-0-0-0-no-list-0-0-0-0-0-0-0-0-' (I'm ignoring the watch boolean flag in the JSON request for... some reason. I can't remember why I did it that way).

Reading state updates are not stateful: each update completely overrides the existing value, so the client has to figure out the incrementing for chapter/volume/etc... Currently, the fragment is ceiling(x, 99), divided by 100, and summed into the chapter number, which is stored as a float. Breaking the fragment number out into a separate column is on my todo list.
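A minimal client-side sketch of this encoding. It assumes that "ceiling(x, 99)" means the fragment is capped at 99 before being divided by 100, which is only one reading of the comment above. The function names are hypothetical and are not taken from the WLNUpdates code.

```python
import math


def encode_progress(chapter: int, fragment: int = 0) -> float:
    """Fold a fragment number into the chapter number as a two-digit decimal.

    Assumption: "ceiling(x, 99)" is read as "cap the fragment at 99".
    """
    capped = min(max(fragment, 0), 99)
    return chapter + capped / 100


def decode_progress(value: float) -> tuple[int, int]:
    """Split a stored float back into (chapter, fragment)."""
    chapter = math.floor(value)
    fragment = round((value - chapter) * 100)
    return chapter, fragment


def next_chapter(value: float) -> float:
    """Updates overwrite the stored value, so the client computes the increment."""
    chapter, _ = decode_progress(value)
    return encode_progress(chapter + 1)
```

Because the value is a float, comparisons on the client side are safer with a tolerance (for example `math.isclose`) than with `==`.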

All calls return a response status object, which indicates API call success, and if there was an error, typically contains the error info.
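A hedged sketch of how a client might check that status object. The field names `error` and `message` are assumptions, mirroring the `getResponse(error=True, message=...)` calls quoted elsewhere in this thread; the real response may carry other fields. `ApiError` and `unwrap_response` are hypothetical names.

```python
import json


class ApiError(Exception):
    """Raised when the response status object reports a failure."""


def unwrap_response(body: str) -> dict:
    """Parse a response body and raise if it signals an error.

    Assumed shape: {"error": <bool>, "message": <str>, ...}.
    """
    status = json.loads(body)
    if status.get("error"):
        raise ApiError(status.get("message", "unknown API error"))
    return status
```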

I'll bash more together later this evening.

fake-name commented 8 years ago

I've started putting together documentation here: https://www.wlnupdates.com/api-docs

nlvw commented 8 years ago

Nice! I'll look over it more in detail later. If you want to make your own mobile app you can consider requiring all API calls to be authenticated. This would also let you limit apps that want to use your site as a source.

Just as a recommendation, all public (i.e. non-authenticated) API calls should be read-only. That way your database can't be messed with.

For simple use, a series search followed by retrieving that series' data is all that's needed.

A little more advanced usage would return a list of popular series (with thumbnails), genre searches that return a list of series in a genre, and a list of latest updated series. The cheapest way to do this would be to pre-populate these lists so you can control the size and update frequency. That way your database doesn't get hit by large complex queries. You can also use these same lists to expand the features of your website.

I believe this is how manga website APIs generally work.

fake-name commented 8 years ago

The only non-read-only non-auth API call I have is the set-rating call, and that's because I want to allow non-authenticated ratings (they're IP-bound in that case).

It's also the only non-auth API call I have whatsoever, so far.

I don't have a lot of time during the week, but I should be able to flesh out the API and add most of the features you describe this weekend.


Realistically, I have no plans for ever writing my own mobile app. Android is horrible (Urrrgh, java), and iOs is even worse (Objective C is the inbred, retarded child of C and probably Cthulhu).

I've mostly just focused on making it decently responsive.

nlvw commented 8 years ago

Cool, nice to hear! I'm still playing around with database schemas for my application, so my progress is going a little slowly. If I ever open-source my code, I'll make sure to send it your way.

Realistically, using Android Studio isn't too bad... however, things tend to get needlessly complicated (probably Java's fault).

fake-name commented 8 years ago

Here are the current HTTP endpoints I have that are relevant here:

/artist-id/<sid>/                      renderArtistId   GET, HEAD, OPTIONS  False   None
/artists/                              renderArtistTable    GET, HEAD, OPTIONS  False   None
/artists/<int:page>                    renderArtistTable    GET, HEAD, OPTIONS  False   None
/artists/<letter>/<int:page>           renderArtistTable    GET, HEAD, OPTIONS  False   None
/artists/<page>                        renderArtistTable    GET, HEAD, OPTIONS  False   None
/author-id/<sid>/                      renderAuthorId   GET, HEAD, OPTIONS  False   None
/author-id/<sid>/<int:page>            renderAuthorId   GET, HEAD, OPTIONS  False   None
/authors/                              renderAuthorTable    GET, HEAD, OPTIONS  False   None
/authors/<int:page>                    renderAuthorTable    GET, HEAD, OPTIONS  False   None
/authors/<letter>/<int:page>           renderAuthorTable    GET, HEAD, OPTIONS  False   None
/authors/<page>                        renderAuthorTable    GET, HEAD, OPTIONS  False   None
/cover-img/<cid>                       renderCoverImage GET, HEAD, OPTIONS  False   None
/feeds/                                renderFeedsTable GET, HEAD, OPTIONS  False   None
/feeds/<int:page>                      renderFeedsTable GET, HEAD, OPTIONS  False   None
/feeds/<page>                          renderFeedsTable GET, HEAD, OPTIONS  False   None
/feeds/source/<source>/                renderFeedsSourceTable   GET, HEAD, OPTIONS  False   None
/feeds/source/<source>/<int:page>      renderFeedsSourceTable   GET, HEAD, OPTIONS  False   None
/feeds/source/<source>/<page>          renderFeedsSourceTable   GET, HEAD, OPTIONS  False   None
/feeds/tag/<tag>/                      renderFeedsTagTable  GET, HEAD, OPTIONS  False   None
/feeds/tag/<tag>/<int:page>            renderFeedsTagTable  GET, HEAD, OPTIONS  False   None
/feeds/tag/<tag>/<page>                renderFeedsTagTable  GET, HEAD, OPTIONS  False   None
/genre-id/<sid>/                       renderGenreId    GET, HEAD, OPTIONS  False   None
/genre-id/<sid>/<int:page>             renderGenreId    GET, HEAD, OPTIONS  False   None
/genres/                               renderGenreTable GET, HEAD, OPTIONS  False   None
/genres/<int:page>                     renderGenreTable GET, HEAD, OPTIONS  False   None
/genres/<letter>/<int:page>            renderGenreTable GET, HEAD, OPTIONS  False   None
/genres/<page>                         renderGenreTable GET, HEAD, OPTIONS  False   None
/group-id/<sid>/                       renderGroupId    GET, HEAD, OPTIONS  False   None
/group-id/<sid>/<int:page>             renderGroupId    GET, HEAD, OPTIONS  False   None
/groups/                               renderGroupsTable    GET, HEAD, OPTIONS  False   None
/groups/<int:page>                     renderGroupsTable    GET, HEAD, OPTIONS  False   None
/groups/<letter>/<int:page>            renderGroupsTable    GET, HEAD, OPTIONS  False   None
/groups/<page>                         renderGroupsTable    GET, HEAD, OPTIONS  False   None
/oel-releases/                         renderOelReleasesTable   GET, HEAD, OPTIONS  False   None
/oel-releases/<int:page>               renderOelReleasesTable   GET, HEAD, OPTIONS  False   None
/oel-releases/<page>                   renderOelReleasesTable   GET, HEAD, OPTIONS  False   None
/oel-series/                           renderOelSeriesTable GET, HEAD, OPTIONS  False   None
/oel-series/<int:page>                 renderOelSeriesTable GET, HEAD, OPTIONS  False   None
/oel-series/<letter>/<int:page>        renderOelSeriesTable GET, HEAD, OPTIONS  False   None
/oel-series/<page>                     renderOelSeriesTable GET, HEAD, OPTIONS  False   None
/publisher-id/<sid>/                   renderPublisherId    GET, HEAD, OPTIONS  False   None
/publisher-id/<sid>/<int:page>         renderPublisherId    GET, HEAD, OPTIONS  False   None
/publishers/                           renderPublisherTable GET, HEAD, OPTIONS  False   None
/publishers/<int:page>                 renderPublisherTable GET, HEAD, OPTIONS  False   None
/publishers/<letter>/<int:page>        renderPublisherTable GET, HEAD, OPTIONS  False   None
/publishers/<page>                     renderPublisherTable GET, HEAD, OPTIONS  False   None
/releases/                             renderReleasesTable  GET, HEAD, OPTIONS  False   None
/releases/<int:page>                   renderReleasesTable  GET, HEAD, OPTIONS  False   None
/releases/<page>                       renderReleasesTable  GET, HEAD, OPTIONS  False   None
/search                                search   GET, HEAD, OPTIONS, POST    False   None
/series-id/<sid>/                      renderSeriesId   GET, HEAD, OPTIONS  False   None
/series-id/<sid>/edit-covers/          renderEditCovers GET, HEAD, OPTIONS  False   None
/series/                               renderSeriesTable    GET, HEAD, OPTIONS  False   None
/series/<int:page>                     renderSeriesTable    GET, HEAD, OPTIONS  False   None
/series/<letter>/<int:page>            renderSeriesTable    GET, HEAD, OPTIONS  False   None
/series/<page>                         renderSeriesTable    GET, HEAD, OPTIONS  False   None
/tag-id/<sid>/                         renderTagId  GET, HEAD, OPTIONS  False   None
/tag-id/<sid>/<int:page>               renderTagId  GET, HEAD, OPTIONS  False   None
/tags/                                 renderTagTable   GET, HEAD, OPTIONS  False   None
/tags/<int:page>                       renderTagTable   GET, HEAD, OPTIONS  False   None
/tags/<letter>/<int:page>              renderTagTable   GET, HEAD, OPTIONS  False   None
/tags/<page>                           renderTagTable   GET, HEAD, OPTIONS  False   None
/translated-releases/                  renderTranslatedReleasesTable    GET, HEAD, OPTIONS  False   None
/translated-releases/<int:page>        renderTranslatedReleasesTable    GET, HEAD, OPTIONS  False   None
/translated-releases/<page>            renderTranslatedReleasesTable    GET, HEAD, OPTIONS  False   None
/translated-series/                    renderTranslatedSeriesTable  GET, HEAD, OPTIONS  False   None
/translated-series/<int:page>          renderTranslatedSeriesTable  GET, HEAD, OPTIONS  False   None
/translated-series/<letter>/<int:page> renderTranslatedSeriesTable  GET, HEAD, OPTIONS  False   None
/translated-series/<page>              renderTranslatedSeriesTable  GET, HEAD, OPTIONS  False   None
/watches                               renderUserWatches    GET, HEAD, OPTIONS  False   None

Boiled down, you have 23 endpoints:

artist-id
artists
author-id
authors
cover-img
feeds
genre-id
genres
group-id
groups
oel-releases
oel-series
publisher-id
publishers
releases
search
series-id
series
tag-id
tags
translated-releases
translated-series
watches

First order of business, I think, is to basically just replicate them via the API. That should allow me to share the underlying database queries and so forth.

fake-name commented 8 years ago

API modes implemented (and completely undocumented) so far:

'get-artists'              
'get-authors'              
'get-genres'               
'get-groups'               
'get-oel-releases'         
'get-oel-series'           
'get-publishers'           
'get-releases'             
'get-series'               
'get-tags'                 
'get-translated-releases'  
'get-translated-series'

fake-name commented 8 years ago

Partial API (and partial docs) are live.

Docs are at https://www.wlnupdates.com/api-docs; the API itself is at wlnupdates.com/api.

More docs and the rest of the planned calls TBD.

gmathi commented 6 years ago

Is this API still being worked on? I am building an app (currently in beta and Android-only for now): https://play.google.com/store/apps/details?id=io.github.gmathi.novellibrary Some users have requested that I integrate your website into the app, and I also like the clean look of your website. Right now I am scraping information, and since your search only returns the name and URL of the series, it doesn't really look good on mobile.

fake-name commented 6 years ago

It's been idle for a while, since the original person interested (Wolfereign) seems to have evaporated.

I'd be happy to finish it up if it'll actually be used.

gmathi commented 6 years ago

Yes sir! For now, I will be scraping your website, but I would like to use the API since it is faster than scraping!

fake-name commented 6 years ago

Significant portions of the API should already be functional, if you want to have a poke:

Dispatching is here: https://github.com/fake-name/wlnupdates/blob/master/app/apiview.py

Handlers are here: https://github.com/fake-name/wlnupdates/blob/master/app/api_handlers_anon.py, https://github.com/fake-name/wlnupdates/blob/master/app/api_handlers.py.

The routes I have yet to implement are:

def get_group_id(data):
    assert "id" in data, "You must specify a id to query for."
    assert is_integer(data['id']), "The 'id' member must be an integer, or a string that can cleanly cast to one."
    a_id = int(data['id'])
    return getResponse(error=True, message="Not yet implemented")

def get_search(data):
    data = check_validate_range(data)
    return getResponse(error=True, message="Not yet implemented")

def get_watches(data):
    return getResponse(error=True, message="Not yet implemented")

def get_cover_img(data):
    data = check_validate_range(data)
    return getResponse(error=True, message="Not yet implemented")

def get_feeds(data):
    data = check_validate_range(data)
    return getResponse(error=True, message="Not yet implemented")

I'm not sure how interactive you want things to be, but the website's existing watch-list and read-to-chapter systems for logged-in users currently drive the API from JavaScript. The entire item-editing interface works the same way.

It's probably also worth noting that I did spend some time trying to make the HTML layout easy to scrape (mostly, there are distinct classes for everything, even though they're not used for anything).

gmathi commented 6 years ago

Yeah, I already finished the scraping. It was easy; the only issue is that I don't get novel images/ratings on the search results page. Does the search API return that information?

fake-name commented 6 years ago

I haven't implemented that part yet.

Basically, the search facility doesn't actually touch the Series table, but rather operates against the AlternateNames table only (well, the advanced search is more complex, but still).

I'll have a look at seeing how expensive a joined load that includes the series table would be.

gmathi commented 6 years ago

Ok. Thanks for the quick responses. I will be scraping for now. Keep me posted on your analysis.

fake-name commented 6 years ago

Some stuff:

I still need to implement:

I spent most of the weekend distracted with silly other crap (machining and manga-scrapers), so I didn't have much time to put into this. Hopefully this week, worst case this coming weekend.

gmathi commented 6 years ago

Thanks for the update. I will try to give it a dry run today.

fake-name commented 6 years ago

Search API stuff is implemented, should be mostly documented.



See the extended docs at https://www.wlnupdates.com/api-docs for how it works.

gmathi commented 6 years ago

Cool! Just went through it! It seems to return pretty much what my scraping returns :) I wanted to know whether the search API can provide the series image, or do I need to call the get-series method too?

fake-name commented 6 years ago

You'd need to call get-series too.

Functionally, the API calls are pretty much executing the same queries that the web interface does (in fact, they share a lot of code). The primary reason to use the API is that it's more easily parsed and more structured.

gmathi commented 6 years ago

Hey, thanks for the update. I will be implementing it over the next week in my app. Will let you know how it went 👍

fake-name commented 6 years ago

Let me know if you have issues/annoyances. The current implementation is very much based only on some head-scratching and my assumptions of how people would want to use it, rather than any actual experience. There's probably a bunch of dumb decisions on my part in there somewhere.

gmathi commented 6 years ago

Search is working fine. 💯 What are the params you need to send for the get-series other than "mode": "get-series" ?

fake-name commented 6 years ago

That should act exactly like the other get-nnn queries that return paginated responses. Sending only the mode will return the first page of results. The additional optional parameters are offset and prefix. See the "Paginated Responses" portion of the API docs.
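A small sketch of building such a paginated query body. The key names `mode`, `offset` and `prefix` come from the comment above; the value types (an integer offset, a string prefix) and the helper name `paginated_query` are assumptions, and whether `offset` counts pages or items is something the API docs would need to confirm.

```python
from typing import Optional


def paginated_query(mode: str,
                    offset: Optional[int] = None,
                    prefix: Optional[str] = None) -> dict:
    """Build the JSON body for a paginated get-nnn call.

    Optional parameters are left out entirely when not given, so sending
    only the mode asks for the first page of results.
    """
    payload: dict = {"mode": mode}
    if offset is not None:
        payload["offset"] = offset  # assumed to be an integer
    if prefix is not None:
        payload["prefix"] = prefix  # assumed to be a string
    return payload
```

For example, `paginated_query("get-series")` produces a body containing only the mode.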

gmathi commented 6 years ago

Oh! Then I am asking the wrong question, I guess. How would I query the API to get a series' details, all the details like artist, authors, licensing, rating, images, and other metadata?

fake-name commented 6 years ago

get-series-id.

Because, well, I'm bad at naming things, I guess.

I don't have docs for it yet, but the entire handler is here. Principally, you pass {'mode' : 'get-series-id', 'id' : id-nnnn}, and it returns a giant json affair.
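A rough sketch of that call using only the Python standard library. The URL is assumed from the "wlnupdates.com/api" mention in this thread, and the JSON Content-Type header is an assumption about what the server expects; `build_series_request` is a hypothetical helper, not part of the site's code.

```python
import json
import urllib.request

# Assumed from "wlnupdates.com/api" mentioned in this thread.
API_URL = "https://www.wlnupdates.com/api"


def build_series_request(series_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for one series' details."""
    payload = {"mode": "get-series-id", "id": series_id}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed requirement
        method="POST",
    )


# Sending it would look roughly like this (not executed here):
#     with urllib.request.urlopen(build_series_request(some_id)) as resp:
#         info = json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect without touching the network.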

gmathi commented 6 years ago

Thanks! 👍

jiteshnemkul commented 6 years ago

@gmathi - That's a really handy app you made! I really appreciate it. I'm kind of a noob developer, and I'm trying to make an app based on the same concept for my college project. Did you get the API for novelupdates for your app? @fake-name - And when will the wlnupdates API be completed?

fake-name commented 6 years ago

Uh, what?

WLNUpdates was pretty much a clean implementation, with no base. It's mildly inspired by http://www.mangaupdates.com/ (in that I use that for tracking manga I've read, and I have opinions about what works there and what doesn't), but everything else was written pretty much from just the flask documentation and thinking about what I wanted in a novel-tracking site.

The API isn't based on anything other than basic principles.

At this point, the API is pretty much feature complete, though I've been lazy about documenting things. I don't think there's anything that I have yet to expose. What specifically are you asking about?

gmathi commented 6 years ago

@fake-name already contributed so much on these APIs (and I am thankful for that!) @jiteshnemkul For your school project, you should use major APIs since there is a lot of support out there. For example, you can create a cloud drive which integrates Google Drive, One Drive(Microsoft), DropBox, Box. All of those services have APIs associated with them.

fake-name commented 6 years ago

@jiteshnemkul - Oh, derp, I completely missed that you had a @ at the beginning of the initial comment. DERP. I somehow read it as being entirely at me.

Gah.

Anyways, AFAICT, novelupdates doesn't even have an API.

Kenji94 commented 6 years ago

@fake-name Olla, I'm planning to use your API too, just to let you know! Can you give me your mail so I can write you a private message ? (Didn't find it on the website)

fake-name commented 6 years ago

@Kenji94 - info <at> wlnupdates <dot> com should make it to me, I think.

What needs to be private, though? It's a public API.

Emmanuella-Aninye commented 5 years ago

Just wanted to inform you that I plan on using the Api to create a webnovel reading app. Thanks for making it because it's been quite helpful! 👍

fake-name commented 5 years ago

@Emmanuella-Aninye - Cool! Let me know if there's stuff I can add to make your life easier.

Emmanuella-Aninye commented 5 years ago

@fake-name Just wanted to know if you currently had a work around in place for the intermediary page some translators use. (I had planned on scraping to grab that info if there was no workaround)

fake-name commented 5 years ago

I haven't come up with a general-case workaround. I do do intermediary removal for sites like Qidian, but smaller sites are generally too ad-hoc to properly script, since they're generally hand-written per-release.

I do pull releases from novelupdates, so that does help.

Emmanuella-Aninye commented 5 years ago

@fake-name I seem to have called the API too often while iterating through IDs, and am now getting the error telling me to contact you. Sorry!

fake-name commented 5 years ago

Oh, that was you? I just threw together some rate limiting last night because I was having some (apparently) actually unrelated issues.

If you're logged in, you won't (or at least shouldn't) hit any rate limits.

The error message went through a bunch of iterations as I was putting it together. It's currently: "API calls when not logged in are rate limited. Please either log in, or slow down. Complain at github.com/fake-name/wlnupdates/issues if this is a problem."
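For clients that are not logged in, one way to "slow down" is a client-side throttle. The sketch below enforces a minimum gap between calls; the actual server-side limit is not stated in this thread, so the interval is a placeholder to tune, and the `Throttle` class is a hypothetical helper.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between calls (a client-side courtesy limit)."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each API request spaces requests out; the clock and sleep functions are injectable so the behaviour can be checked without real delays.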

fake-name commented 5 years ago

I'm not sure what your data model is, but it's probably unwise to rely on a snapshot of the site's descriptions, as they're continually synchronized with a number of external sources. It'd be better to either have something like an LFU/LRU cache of the API results, or just fetch them as needed.
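A minimal sketch of the LRU option, assuming the client has some function that fetches one series by ID from the API. `SeriesCache` and the `fetch` callable are hypothetical names. Since descriptions keep changing upstream, a real client would probably also want entries to expire after some time, which this sketch leaves out.

```python
from collections import OrderedDict
from typing import Callable


class SeriesCache:
    """Small least-recently-used cache in front of a per-series fetch function."""

    def __init__(self, fetch: Callable[[int], dict], max_size: int = 256) -> None:
        self._fetch = fetch
        self._max_size = max_size
        self._items: "OrderedDict[int, dict]" = OrderedDict()

    def get(self, series_id: int) -> dict:
        if series_id in self._items:
            # Cache hit: mark as most recently used.
            self._items.move_to_end(series_id)
            return self._items[series_id]
        # Cache miss: fetch, store, and evict the least recently used entry.
        value = self._fetch(series_id)
        self._items[series_id] = value
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)
        return value
```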

If you really want everything, it'd probably be better for me to just give you a SQL dump of the series table. Let me know what works for you.

Emmanuella-Aninye commented 5 years ago

@fake-name Yeah I was trying to iterate through the series ids and store the series data in a db (started off with mysql but swapped to mongo). If i could get an sql dump of the series table that would be great as then I'd only need to call releases for the updates which would significantly reduce api calls.

fake-name commented 5 years ago

@Emmanuella-Aninye - wlndump.sql.xz.zip

In a zip, because github is dumb and doesn't allow xz attachments. Includes OEL stuff, because pg_dump doesn't do filtering (or it does, and I missed it).

It's a postgres db dump.

Emmanuella-Aninye commented 5 years ago

@fake-name Thanks! Theoretically speaking, would I be able to pass user sign-up/login from my app through to your API, so that people are considered logged in?

fake-name commented 5 years ago

Probably? I don't see why not, barring the complexity of handling several login IDs/confirmation e-mails etc...

I hadn't thought about federated login.

Emmanuella-Aninye commented 5 years ago

@fake-name Do you send confirmation emails on your side? Also, I don't think I saw a way to handle a forgotten password on the site?

fake-name commented 5 years ago

> Do you send confirmation emails on your side?

Yep. Via a gmail account, because it was easy.

> Also, I don't think I saw a way to handle a forgotten password on the site?

The current way to handle forgotten passwords is to open a github issue, and I'll reset it manually. It's.... kind of horrible.

Mostly, it hasn't been needed.

Emmanuella-Aninye commented 5 years ago

@fake-name I'm still hitting that error when I try to make API calls. I've also tried making an account, but no confirmation has arrived so far.

fake-name commented 5 years ago

Huh, I wonder if I broke something

fake-name commented 5 years ago

Ok, yeah, google seems to have locked the account I was using.

Siiiigh. I really didn't want to have to set up an MTA.

Emmanuella-Aninye commented 5 years ago

@fake-name sorry for the inconvenience 😛😛