subutux / rmapy

An unofficial Python module for interacting with the reMarkable Cloud
http://rmapy.readthedocs.io/
MIT License

reMarkable rolling out a new API for cloud storage #25

Open · sabidib opened this issue 3 years ago

sabidib commented 3 years ago

According to https://github.com/juruen/rmapi/issues/187 it seems as though a new API for the remarkable cloud is being rolled out that uses GCS directly.

It seems to have been rolling out slowly to different users over the last two weeks. I'm not on it yet, as my cloud still seems to sync correctly using rmapy and rmapi.

@subutux Have you noticed any issues with your syncing?

If anyone is on the new API, I would gladly pair with them to work on figuring out the new API and get a PR merged into rmapy/rmapi.

AaronDavidSchneider commented 3 years ago

I wrote a mail to you. I would love to help.

subutux commented 3 years ago

Hi @sabidib

I don't actively use rmapy; however, I did a quick test and it seems that I'm unable to log in: my authentication requests are redirected to an https://doesnotexist.remarkable.com host:

MaxRetryError: HTTPSConnectionPool(host='doesnotexist.remarkable.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb9357ac4c0>: Failed to establish a new connection: [Errno -2] Name or service not known'))

anyone having the same experience?

subutux commented 3 years ago

Found the issue: juruen/rmapi#177

subutux commented 3 years ago

After fixing the authentication issue, I was able to upload a sample PDF file and it was downloaded on my reMarkable. So it seems that my reMarkable is not (yet) using the new storage API. @sabidib & @AaronDavidSchneider, does either of you have the new storage API?

If so, could one of you use mitmproxy (or some other proxy software) to perform the following actions in the desktop app?

  1. Open an existing document in the app
  2. change the page
  3. close the document
  4. upload the provided sample pdf document
  5. report the log of mitmproxy here, or privately to me (subutux@gmail.com)

Then, I'll try to implement the storage api.

subutux commented 3 years ago

FYI, I'm leaving on vacation in 6 hours for a week, so my response may be delayed.

AaronDavidSchneider commented 3 years ago

Started to work on it. But I still have a lot of open questions... The whole thing is very slow and weird.

sabidib commented 3 years ago

@subutux: @AaronDavidSchneider and I have been working through the API.

I wrote up a doc outlining what we know so far about the API: https://docs.google.com/document/d/1peZh79C2BThlp2AC3sITzinAQKJccQ1gn9ppdCIWLl8/edit

I have a branch that is based off of @AaronDavidSchneider 's PR that implements all the above functionality.

The PR is still missing a solid public API similar to the current rmapy public API, because loading .rm and .highlight files requires 2 requests per file.

It may be that we give the user the option to lazily download the .rm and .highlight files as they access them.

Observations

Given the current API, getting an actual data file for a given document takes at least 4 requests and at most 8:

Request 1:
    Get the root file index GCS download URL
Request 2:
    Download the root file index from the returned URL.
Request 3:
    Get the root file GCS download url using the GCS path from the root file index
Request 4:
    Download the root file from the GCS download URL
Request 5:
    Get the index file GCS download URL using the GCS path from the root file.
Request 6:
    Download the index file from the GCS download URL.
Request 7:
    Get the data file GCS download URL using the GCS path from the index file.
Request 8:
    Download the data file

In the same session you may be able to omit getting the root file and index file for multiple data file retrievals.
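The four resolve-and-download hops above can be sketched with a stub transport that just counts requests. This is a minimal sketch, not the real client: the endpoint URL and payload shape are assumptions drawn from this thread's notes, and index parsing is elided.

```python
class Transport:
    """Stub HTTP layer: each call counts as one request. A real client
    would use an authenticated HTTP session instead."""
    def __init__(self):
        self.requests = 0

    def post(self, url, payload):
        self.requests += 1
        # The real service would answer with a signed GCS download URL.
        return {"url": "https://storage.googleapis.com/" + payload["relative_path"]}

    def get(self, url):
        self.requests += 1
        # The real service would answer with the blob's bytes.
        return url.rsplit("/", 1)[-1]

# Assumed endpoint name for illustration, not a documented API.
SIGNED_URL_ENDPOINT = "https://internal.cloud.remarkable.com/sync/v2/signed-urls/downloads"

def fetch_blob(http, relative_path):
    """Two requests: resolve a GCS path to a signed URL, then download it."""
    signed = http.post(SIGNED_URL_ENDPOINT,
                       {"http_method": "GET", "relative_path": relative_path})
    return http.get(signed["url"])

def fetch_data_file(http):
    """The full 8-request chain. The literal paths below stand in for the
    GCS paths you would read out of each downloaded index blob."""
    fetch_blob(http, "root")             # requests 1-2: root file index
    fetch_blob(http, "root-file")        # requests 3-4: root file
    fetch_blob(http, "document-index")   # requests 5-6: index file
    return fetch_blob(http, "data-file") # requests 7-8: the actual data file

http = Transport()
fetch_data_file(http)
print(http.requests)  # -> 8
```

In a real client the two root hops would be cached per session, which is what makes subsequent retrievals cheaper.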

This means that, in the best case, for a reMarkable with 100 documents we make

8 + 4*99 = 404 requests

to get all the metadata.

Subsequent requests for a data file cost only 2 requests each. This could still mean hundreds of requests to get all the .rm or .highlights files for a given document.
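As a sanity check on the arithmetic: assuming the first document pays the full 8-request chain and every later document in the same session reuses the already-fetched root blobs, the metadata cost is:

```python
def metadata_requests(num_documents):
    """First document: full 8-request chain. Each later document in the
    same session skips the root index/root file lookups (4 requests)."""
    return 8 + 4 * (num_documents - 1)

print(metadata_requests(100))  # -> 404
```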

fishy commented 2 years ago

Hi,

Thanks for the great write-up on Google Doc! I was trying to follow it to implement the 1.5 API today, and got pretty far, but encountered two issues:

  1. (trivial) When I try to GET https://service-manager-production-dot-remarkable-production.appspot.com/?environment=production&apiVer=1 I always get a 404, so I used the hardcoded API host instead.
  2. (important) When uploading an epub file, all the earlier uploads (the .metadata, .content, .pagedata, .epub, index, and root index files) succeed, and I can get a signed GCS URL from the upload API for the final root file index update (the request with payload {"generation": "1627493476159831", "http_method": "PUT", "relative_path": "root"}), but the PUT to that GCS URL always gives me a 400 error.

Did anyone else see the 400 problem when trying to update the root file index?

(my wip go code is at https://github.com/fishy/url2epub/compare/main..api15, in case anyone is curious)

fishy commented 2 years ago

^ Oh, I figured it out. A few things required for upload to work are missing from the Google Doc.

Things I'm 100% sure are required for upload to work:

  1. For the HTTP PUT request that updates the final root file, you must also set the x-goog-if-generation-match header; otherwise you'll get a 400 response and fail to update root.

Things I also did but am not 100% sure are required for upload to work:

  1. The GCS paths should be the sha256 hash of the uploaded content
  2. The index files (both the document index and the root index) should be sorted by GCS path
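The three points above can be sketched as small helpers. The x-goog-if-generation-match header name comes straight from this thread; the index format (one GCS path per line) and the exact payload shapes remain assumptions for illustration.

```python
import hashlib

def gcs_path(content: bytes) -> str:
    # Point 2.1: the GCS relative path is the sha256 hex digest of the blob.
    return hashlib.sha256(content).hexdigest()

def build_index(entries: list) -> str:
    # Point 2.2: index files list their entries sorted by GCS path.
    # (One path per line is an assumed format, not confirmed here.)
    return "\n".join(sorted(entries))

def root_put_headers(current_generation: str) -> dict:
    # Point 1: the PUT that rewrites "root" must carry this precondition
    # header with the generation from the signed-URL response, or GCS
    # answers 400.
    return {"x-goog-if-generation-match": current_generation}

print(gcs_path(b"hello"))
print(build_index([gcs_path(b"page2"), gcs_path(b"page1")]))
print(root_put_headers("1627493476159831"))
```

The generation check makes the root update a compare-and-swap: if another device updated root in the meantime, the PUT fails instead of silently clobbering their changes.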

The final go commit to implement API 1.5 in my project is at https://github.com/fishy/url2epub/commit/72998a916bcd04de4b71654ce59072032797725c, in case anyone is curious.