pl-semiotics / rM-vnc-server

Damage-tracking VNC server for the reMarkable tablet
GNU General Public License v2.0

Support for system software version 2.6 #8

Open pks40 opened 3 years ago

pks40 commented 3 years ago

Apparently, the binaries need to be recompiled to work with the new reMarkable software version 2.6, since some shared libraries were updated to new versions in this update. (At least libcrypto.so was updated; I’m not sure about others.)

(Perhaps this is also an issue for nix-remarkable, but since I don’t really understand the build system, I’m not sure about this.)

bordaigorl commented 3 years ago

@pl-semiotics any news about this? I haven't got the update yet (and would not want to update until this is fixed as I use rmview daily for work) so I cannot test workarounds... For RM1, would compiling the server linking those libraries statically help? Maybe not efficient but would allow for the same executable to work on all versions? What about RM2, I think there's some patching involved and there things might be less obvious?

@pks40 would be nice to have a list of the updated libraries...(are you on RM1 or RM2?)

bbolker commented 3 years ago

Got the update, would be eager to test on both RM1 and RM2 if the instructions for applying patches/workarounds/updates are sufficiently easy for me to follow ...

Should I start by following the build instructions on the front page of this repo and then figuring out how to combine them with rmview (which is the only way in which I use the software)?

wmkoolen commented 3 years ago

Dear all,

The following works for me.

  1. while still on firmware 2.5, copy /usr/lib/libcrypto.so.1.0.2 from the remarkable to somewhere safe.
  2. upgrade to 2.6. This will introduce /usr/lib/libcrypto.so.1.1 and remove /usr/lib/libcrypto.so.1.0.2
  3. copy the stored libcrypto.so.1.0.2 from step 1 back onto the remarkable (as /usr/lib/libcrypto.so.1.0.2)

Now rM-vnc-server-standalone launches and works, and hence so does rmview. This holds on both firmware versions 2.6.1.71 and 2.6.2.75.
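The three steps above could be scripted roughly as follows. This is only a sketch: the RM_HOST variable, the function names, and the backup directory are assumptions made for illustration, and it assumes the tablet is reachable over USB at 10.11.99.1.

```shell
# Sketch of the backup/restore workaround; names here are hypothetical.
RM_HOST="${RM_HOST:-root@10.11.99.1}"   # assumed USB address of the tablet
LIB=/usr/lib/libcrypto.so.1.0.2

# Step 1: run while the tablet is still on firmware 2.5.
backup_libcrypto() {
    backup_dir="$1"
    mkdir -p "$backup_dir" && scp "$RM_HOST:$LIB" "$backup_dir/"
}

# Step 3: run after the upgrade to 2.6 has removed the old library.
restore_libcrypto() {
    backup_dir="$1"
    scp "$backup_dir/libcrypto.so.1.0.2" "$RM_HOST:$LIB"
}
```

Usage would be something like backup_libcrypto ~/rm-backup before upgrading, and restore_libcrypto ~/rm-backup afterwards.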

juanfcocontreras commented 3 years ago

Hello @wmkoolen ,

Thanks for the information. Could you upload libcrypto.so.1.0.2? I've already upgraded my Remarkable, so I have no option to get this file anymore.

Thanks in advance!

bordaigorl commented 3 years ago

@juanfcocontreras Here it is (from RM1): libcrypto.so.1.0.2.zip

@wmkoolen Thank you so much for reporting this!

If anybody could post the library from RM2-pre2.6 @bbolker may test it out for us

bordaigorl commented 3 years ago

Here is libcrypto for RM2 (courtesy of Mattéo Delabre) libcrypto.so.1.0.2.zip @bbolker could you test it for us? Just unzip it and copy it over with (assuming connection via USB)

scp /path/of/uncompressed/libcrypto.so.1.0.2 root@10.11.99.1:/usr/lib/

Then start rmview and check if it works.

juanfcocontreras commented 3 years ago

> Here is libcrypto for RM2 (courtesy of Mattéo Delabre) libcrypto.so.1.0.2.zip @bbolker could you test it for us? Just unzip it and copy it over with (assuming connection via USB)
>
> scp /path/of/uncompressed/libcrypto.so.1.0.2 root@10.11.99.1:/usr/lib/
>
> Then start rmview and check if it works.

It's working here in my RM2.

Thank you so much!

bordaigorl commented 3 years ago

@juanfcocontreras could you maybe check if the file I posted for RM1 works on RM2 too?

juanfcocontreras commented 3 years ago

> @juanfcocontreras could you maybe check if the file I posted for RM1 works on RM2 too?

Sure, and it's working too!

pks40 commented 3 years ago

Thanks a lot to all of you for working this out! It’s great that only one library has to be included ...

bbolker commented 3 years ago

posted file works for me on RM2, hurrah.

pl-semiotics commented 3 years ago

Hi all, I apologize for the delay on this issue---I don't have 2.6, and I don't know where an updated sdk might be, so it's nontrivial to build a version that natively links against the new library. I'm hoping that an updated sdk gets released at some point, but if not I will look at hacking something together once I have the update. Unfortunately the rM for some reason doesn't have the normal libX.so.<major> symlink and SONAME for crypto, which causes this issue even though the major version of libcrypto hasn't changed.

Using the old library will probably work in the meantime, although I'm hesitant to recommend this, since there are various potential issues. If you want to avoid cluttering up /usr/lib, you should be able to drop it anywhere you like and use LD_LIBRARY_PATH=/path/to/directory/containing/it when launching the vnc server, by the way.
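For what it's worth, the LD_LIBRARY_PATH approach could be wrapped like this. It is a minimal sketch: the function name and the paths in the usage note are hypothetical, not something shipped with the server.

```shell
# Sketch: run a command with a private library directory searched first.
run_with_private_libs() {
    lib_dir="$1"
    shift
    # Prepend lib_dir and keep any LD_LIBRARY_PATH that is already set.
    LD_LIBRARY_PATH="$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}
```

For example, with the old libcrypto.so.1.0.2 copied to a directory such as /home/root/oldlibs (an assumed location), one would run run_with_private_libs /home/root/oldlibs ./rM-vnc-server-standalone. Only that one process sees the old library, so /usr/lib stays untouched.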

bordaigorl commented 3 years ago

@pl-semiotics thanks for the remarks. In what ways could the workaround cause problems? Do you think symlinking the old version to the new one would work? Would linking the library statically (wrt the old version) work and avoid the potential problems?

bordaigorl commented 3 years ago

I got the update and can confirm that ln /usr/lib/libcrypto.so.1.1 /usr/lib/libcrypto.so.1.0.2 works on my RM1. I do not know if there are bad side effects of using the "wrong" version. I'd like to push a fix for rmview but I am not sure which to pick. Shipping yet another binary seems a bit ugly, symlinking seems better but I don't know if it's safe (and I am not sure which version of libcrypto is on the RM2). Any ideas?

bbolker commented 3 years ago

In case this is useful to anyone (RM2):

pl-semiotics commented 3 years ago

@bordaigorl Since a new version of the SDK seems to not be forthcoming, I'll check what static linking does to the size of the executable tomorrow and look at how to support both versions if that doesn't work.

pl-semiotics commented 3 years ago

(The real trouble is that there was a bunch of API/ABI breakage in libcrypto 1.1, and of course without the updated SDK or similar, I don't have header files for the new version's interface. Depending on whether any codepath encounters a breaking change in the library ABI, everything might work fine or might have unpredictable behavior. I can of course try building my own copy of libcrypto from upstream and hoping that the one on the rM doesn't differ much, and see how that works; but I'd really rather have an updated "official" sdk, where one knows that the interfaces in it are supposed to correspond to the libraries on the tablet.)

bordaigorl commented 3 years ago

> The real trouble is that there was a bunch of API/ABI breakage in libcrypto 1.1

Yes I was checking that as well...the symlinking seemed to work at first, but on further testing the behaviour seemed less reliable...not a definite answer, but there.

Having the two versions side by side should be fine though?

pl-semiotics commented 3 years ago

Oh, I forgot. I actually disabled openssl in libvncserver, so libcrypto is only being used by the few functions on the rm2. Actually, has anyone actually seen this issue on the rm1? I can't reproduce it on the rm1.

The situation is complicated on the rm2 by the fact that it seems that the sdk doesn't include a static libcrypto.a either.

> Yes I was checking that as well...the symlinking seemed to work at first, but on further testing the behaviour seemed less reliable...not a definite answer, but there.

Yes. On the rm2, libcrypto is used to hash the binary that we are patching for caching purposes, but the relevant functions were renamed between 1.0.2 and 1.1.0, which makes building something that would work with the new version rather nontrivial. Even though this is only a small bit of functionality, I used libcrypto because it is already present and resident in memory on the tablet. Tomorrow, I will look at either trying to build my own 1.1.0 headers and hoping that they correspond to the library on the tablet, or looking for a replacement (small) static library with the md5 functions that are all I'm currently using.

> Having the two versions side by side should be fine though?

This should be basically fine, I think.

bordaigorl commented 3 years ago

> libcrypto is only being used by the few functions on the rm2.

Ok now everything makes more sense! I was puzzled by the fact that symlinking the old library to the new seemed to work on RM1 given the non-backward compatibility of it (see here). Indeed this is because on RM1 the library is not needed at all! I should have tried first...I think the unreliability thing I was mentioning was generated by using WiFi instead of USB.

> looking for a replacement (small) static library with the md5 functions that are all I'm currently using.

Is running md5sum a stupid idea? I guess you are doing it only once at startup...would it be so horrible?

Keep me posted, I'll make the necessary changes to rmview as soon as we have a nice solution for RM2.
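To make the md5sum suggestion concrete, here is a rough sketch of hashing a file by shelling out instead of calling libcrypto. It assumes an md5sum utility (coreutils or busybox) is on the PATH; whether one is present on the tablet is not confirmed in this thread, and the function name is made up.

```shell
# Sketch: compute the MD5 digest of a file with the md5sum utility.
md5_of_file() {
    # md5sum prints "<hex digest>  <filename>"; keep only the digest.
    md5sum "$1" | cut -d ' ' -f 1
}
```

Called on the path of the xochitl binary, this would print a hex digest that could serve as a cache key. The cost compared with linking libcrypto is one extra process at startup.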

rfielding commented 3 years ago

> > Here is libcrypto for RM2 (courtesy of Mattéo Delabre) libcrypto.so.1.0.2.zip @bbolker could you test it for us? Just unzip it and copy it over with (assuming connection via USB)
> >
> > scp /path/of/uncompressed/libcrypto.so.1.0.2 root@10.11.99.1:/usr/lib/
> >
> > Then start rmview and check if it works.
>
> It's working here in my RM2.
>
> Thank you so much!

This works. I ran into this on my Linux system. I get a similar error on MacOS (where I can't get LiveView to work right). I expect when I am back to my M1 mac mini, I can get this to work. (But note that you have to be using conda for the M1 mac mini, or Python is all kinds of messed up for use with rmview.)

bordaigorl commented 3 years ago

@pl-semiotics any update on this?

bordaigorl commented 3 years ago

In case it helps, the new toolchains have been published: https://discord.com/channels/385916768696139794/386181213699702786/830914172640428053

reMarkable 1: https://storage.googleapis.com/remarkable-codex-toolchain/codex-x86_64-cortexa9hf-neon-rm10x-toolchain-3.1.2.sh

reMarkable 2: https://storage.googleapis.com/remarkable-codex-toolchain/codex-x86_64-cortexa7hf-neon-rm11x-toolchain-3.1.2.sh

corwin-of-amber commented 3 years ago

I managed to run the published executable after downloading libcrypto.so.1.0.2 (rM2), and the server actually runs but the client shows the current image and never updates (using vnc-viewer installed via HomeBrew). Sometimes it also causes xochitl to hang and I need to restart it.

bordaigorl commented 3 years ago

@corwin-of-amber that sounds like issue #5 : the current version is not compatible with rm2fb, so you have to uninstall/deactivate rm2fb before running the vnc server

pl-semiotics commented 3 years ago

I deeply apologise for the delays involved on this issue.

@bordaigorl Thank you for the toolchain links! I have no idea where those were published, but I'd been looking around when this issue was first opened and couldn't find them---those should be precisely what is needed. I will make the minor modifications necessary as soon as possible and provide a new round of test executables that I am rather more confident in.

bordaigorl commented 3 years ago

@pl-semiotics gentle ping: can I help with anything to get this sorted?

pl-semiotics commented 3 years ago

I apologise again for the delay in resolving this issue. I believe that these binaries should work on both rm1 and rm2, both with and without rm2fb in the latter case, under the new system software revisions. If people with hardware/rm2fb could corroborate this, I will go ahead and make an official release.

bordaigorl commented 3 years ago

@pl-semiotics thank you so much for your amazing work! I only have RM1, I'll try to recruit somebody for testing on RM2. I will look at this next weekend.

wmkoolen commented 3 years ago

Dear @pl-semiotics,

On Remarkable2 with OS version 2.8.0.98, your new binary gives

Using backend libqsgepaper-snoop
Bad cache preamble 0

and it exits.

Your previous binary prints

Using backend libqsgepaper-snoop
06/07/2021 12:25:05 Listening for VNC connections on TCP port 5900
06/07/2021 12:25:05 Listening for VNC connections on TCP6 port 5900

and continues to listen.

Would any other information be helpful for you?

pl-semiotics commented 3 years ago

@bordaigorl Thank you! Apologies again for the long delays---I thought I'd sent something out for testing a while back, but it seems that it didn't quite make it.

@wmkoolen Thank you for testing. This is very strange: an internal consistency check to ensure the goodness of cached data is failing now, but this code should not have changed. A few questions:

wmkoolen commented 3 years ago

I moved .cache/libqsgepaper-snoop/ out of the way, and then the new binary worked. The output was

Using backend libqsgepaper-snoop
No cached info found for /proc/215/exe
Uncompressing extraction program
Running extraction pass 9695
06/07/2021 12:49:18 Listening for VNC connections on TCP port 5900
06/07/2021 12:49:18 Listening for VNC connections on TCP6 port 5900

... and listening. The viewer can now connect to it, and screen sharing works. All good.

The lines relating to uncompression/extraction only appear on the first run. What is being uncompressed/extracted here?

FWIW: I now have two files: .cache/libqsgepaper-snoop/[long hash] and .cache/libqsgepaper-snoop.old/[same long hash] that have the same first line in them, "libqsgepaper-snoop cached info v1", and a different second line that seems to be some binary format.
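In case someone wants to repeat this, moving the cache aside could look roughly like the sketch below. The default location under $HOME and the function name are assumptions; the .old suffix just mirrors the directory name mentioned above, and refusing to overwrite an existing backup is a choice made for this example.

```shell
# Sketch: move the snoop cache aside so the server rebuilds it on next start.
set_aside_cache() {
    cache_dir="${1:-$HOME/.cache/libqsgepaper-snoop}"   # assumed location
    if [ ! -d "$cache_dir" ]; then
        echo "no cache directory at $cache_dir" >&2
        return 1
    fi
    if [ -e "$cache_dir.old" ]; then
        echo "$cache_dir.old already exists, not overwriting" >&2
        return 1
    fi
    mv "$cache_dir" "$cache_dir.old"
}
```

Running set_aside_cache with no argument on the tablet would rename the default directory; the old contents stay available for comparison or for putting back later.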

pl-semiotics commented 3 years ago

@wmkoolen

Thank you for trying that out; I am glad it worked.

The behaviour you mentioned is actually expected in one case: if you started xochitl, then you ran the old version, then you ran the new version without restarting it. Otherwise, I am not quite sure why it has happened---please try stopping and restarting the (new) vnc server to make sure that it still works.

Now the compression/extraction line and the issue are quite interrelated. Basically, in order to hook into the xochitl binary the way we must on rm2, we must find out several bits of information about the layout of the process, chiefly the address of the sendUpdate function which we must hook. This is rather nontrivial, and in fact we use an entire arm emulator (unicorn, based on qemu) in order to do the necessary dynamic analysis. Since this process is a bit time- and space-heavy, but provides results that do not change (due to the lack of aslr on the rM), we cache the results in the cache directory I discussed above (indexed by the hash of the xochitl binary). Since we do this task only quite rarely, and unicorn is very large, the bit of code that does the extraction of this information is compressed; the decompression and extraction lines indicate that, in the absence of cached data, this process is being run.
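As a rough illustration of the lookup side of this, the sketch below checks whether a cache entry exists for a given hash. The one-file-per-hash layout and the header line are the ones reported earlier in this thread; the function name is hypothetical, and the real server does this in its own code, not in shell.

```shell
# Sketch: is there a plausible cache entry for this binary hash?
CACHE_HEADER="libqsgepaper-snoop cached info v1"   # header line seen in the cache files

has_cached_info() {
    cache_dir="$1"
    hash="$2"
    entry="$cache_dir/$hash"
    [ -f "$entry" ] || return 1
    # Read only the first line: the rest of the file is binary.
    first_line=$(head -n 1 "$entry")
    [ "$first_line" = "$CACHE_HEADER" ]
}
```

When has_cached_info fails, that corresponds to the "No cached info found" case, where the expensive extraction pass has to run.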

The code takes quite some care to ensure that it is injecting into a process that it understands before doing anything, since there is a theoretical possibility for misbehaving userspace code to physically damage the eink screen on the rm2. So, in addition to the hash, some consistency checks are undertaken: chiefly, it ensures that the instructions that it overwrites when hooking sendUpdate are those that were there when the cache was first created (so that if, say, ASLR is turned on, it will fail harmlessly rather than overwrite something problematic). Of course, once those instructions have been overwritten, they are not the original anymore, and so in the naive case if you run the vnc server, kill it, and then run it again, it will fail with precisely this "bad cache preamble" error. Some care is taken to avoid this by checking some magic numbers to see if the only change is that we have already run one injection pass (in which case we may run another without fear), but the layout of the injected-code page changed slightly between the old and new versions. So, if you run the old version and then the new version, the new version will not find the magic numbers it is looking for when it notices that the old version has already patched the beginning of the sendUpdate function, and so it will fail with that error message in order to be cautious.
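The decision just described can be summarised in a small sketch. It works on hex strings passed as arguments rather than on live process memory, and every name and the three outcomes are hypothetical; the actual checks in the server are more involved.

```shell
# Sketch of the preamble check.
#   $1  bytes currently at the hook site
#   $2  original bytes recorded in the cache
#   $3  magic value found in the injected-code page ("" if none)
#   $4  magic value this server version expects
check_preamble() {
    if [ "$1" = "$2" ]; then
        echo "inject"      # untouched process: safe to hook
    elif [ -n "$3" ] && [ "$3" = "$4" ]; then
        echo "reinject"    # already patched by a compatible version
    else
        echo "bad cache preamble" >&2
        return 1           # unknown state: refuse to touch the process
    fi
}
```

Running the old version and then the new one corresponds to the last branch: the hook site no longer holds the original bytes, and the magic value left by the old layout is not the one the new version looks for.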

I think this scenario is the one that makes the most sense; hopefully it matches with what occurred?

I hope this answered your questions as to the extraction/uncompression lines and the existence of the cache.

pl-semiotics commented 3 years ago

Oh, and one other note: if this is in fact the issue, I think that returning the original contents of the cache directory to their place and restarting xochitl ought to fix your issue as well, and would be the "more correct" thing to do, as your new cache may have quite the wrong information in it (in the form of the instructions injected by the old binary).

pl-semiotics commented 3 years ago

@bordaigorl If rmview is still installing things, then if/when this update is added, please ensure that xochitl is restarted if this was an upgrade from the old version. Alternatively, I suppose, on realising that it is a format-changing update, one could simply try it out and restart xochitl (or prompt the user to restart it) only if it fails, which covers the case where the old version had already been run.
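The "try first, restart only on failure" option might be sketched like this. Nothing here is rmview's actual code: the function name is made up, both commands are supplied by the caller, and it assumes the start command returns a non-zero status when it fails.

```shell
# Sketch: attempt to start; restart and retry once only if that fails.
try_then_restart() {
    start_cmd="$1"
    restart_cmd="$2"
    if sh -c "$start_cmd"; then
        return 0
    fi
    echo "first attempt failed, restarting and retrying" >&2
    sh -c "$restart_cmd" && sh -c "$start_cmd"
}
```

On the tablet the restart command would presumably be something like systemctl restart xochitl (an assumption). Since a working server keeps running instead of returning, a real launcher would need another way to detect a successful start, for example by watching for the "Listening for VNC connections" line.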

wmkoolen commented 3 years ago

Dear @pl-semiotics, Thanks for the explanation, that makes sense. I ended up discarding the cache and rebooting the tablet. All appears in good order. Many thanks!