MrCyjaneK / jwapi

FOSS replacement for the JW Library app that works on Ubuntu Touch, Debian (Mobian and Droidian), Android, and any other OS. Uses the jw.org API directly.
https://mrcyjanek.net/projects/jwapi/
GNU General Public License v2.0

Need help with understanding JWPUB format #1

Open MrCyjaneK opened 3 years ago

MrCyjaneK commented 3 years ago

I have no idea how to get the words out of Content in the .db file located in the JWPUB archive. I've documented what I know so far, so any help is welcome.

orangethewell commented 3 years ago

Hi! I had this idea (scraping JWPUB files) some days ago and was searching for anything about these JW Library files. Apparently these files are linked directly to jw.org in some way, but I still haven't found anything about how that linking works. By the way, I was also thinking that the bytecode should be an ID from the Words table, but I don't think it's a direct ID; maybe it follows some set of instructions that JW made for it.

Are you one of Jehovah's Witnesses?

MrCyjaneK commented 3 years ago

I even sent a couple of emails requesting documentation, but the response said they were unable to answer my question from that email address. So my idea was to call the number from https://www.jw.org/en/jehovahs-witnesses/contact/united-states/, but I haven't had much time recently, so I haven't done that.

And yes, I am

MrCyjaneK commented 3 years ago

I've put a lot of time into understanding this format, but still no results worth showing.

It's sad that most of the new publications are PDF/JWPUB only. PDF just doesn't scale well, and JWPUB is, ugh.

I still have one idea, scraping wol.jw.org, but I'm against sending hundreds or thousands of requests (every image, article, quote, source) to get one publication.

orangethewell commented 3 years ago

Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option, considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

In the end, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, except that they are heavily modified and for some reason there's binary data that doesn't match a list of words.

I will try doing something with my knowledge of Python. I don't know whether I'll be of any help, but at least I'll try for fun. I really like being able to use JW Library on a PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think it will stay that way for long; maybe someday they'll release a version for a popular Linux distro.

A final question... I saw that your project uses JW's app API, but is this allowed? Isn't it a violation of some of the app's terms of use?

orangethewell commented 3 years ago

Hello! So, I ran some experiments with the JWPUB file to learn how it works, and I think I got some interesting results! First of all, Content is directly tied to the page and doesn't accept anything new (maybe because Content has a fixed size in bytes and I inserted more than that? I don't know). Furthermore, the Words table doesn't work the way we thought: I changed a word in this table, and all that changed is how I find it in the book. Now I need to search for "subjecters" instead of "subject", while the word in the documents stays the same.

So, in the end, I got a "How to Remain in God's Love" book whose subjects section is titled "Edited Subjects" and whose "Letter from the Governing Body" is blank.

EDIT: I read the documentation you linked. Maybe begin and end are the first and last bytes to be converted, but that raises the question: converted into what, if it's not an index into the Words table?

MrCyjaneK commented 3 years ago

Wow! That's great! I lost so much time on the Words table... So you are saying that Content directly holds the content, and doesn't just reference Words?

Have you seen the part below the sentence "Huh It's quite short." in the docs? https://raw.githubusercontent.com/MrCyjaneK/jwapi/master/docs/jwpub/index.md

You also have this there:

"What is the news from God?" translates to:

Decimal 1246 616 1131 758 474 499
Hex 4de 268 46b 2f6 1da 1f3

Which is quite short, so my guess was that it uses the Words table. Maybe the app stores rendered publications somewhere in a cache, and that's why changing the table didn't change the content?

Or, another scenario: Content is compressed in some way...
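As a sanity check on the numbers quoted above, here is a small Python sketch. It only confirms that the Decimal and Hex rows are the same six values, and that there is one value per word of the sentence, which is what made the Words-table guess tempting. The variable names are mine, and nothing here says what the values actually encode.

```python
# The sentence and the six values exactly as quoted from the docs.
sentence = "What is the news from God?"
decimal_values = [1246, 616, 1131, 758, 474, 499]

# Render each value as lowercase hex without the "0x" prefix.
hex_values = [format(value, "x") for value in decimal_values]
print(" ".join(hex_values))  # prints: 4de 268 46b 2f6 1da 1f3

# One value per whitespace-separated word in the sentence.
print(len(sentence.split()) == len(decimal_values))  # prints: True
```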

orangethewell commented 3 years ago

Okayyy, I think I have a problem with the customized JWPUB, and I don't know exactly what the app was loading.

I had already seen what's in the JWPUB conversion doc, and yeah, that could be it, but... there's something strange going on and I don't know exactly what happened.

I changed a lot of things in the original DB because I thought I was repacking it into a new JWPUB file with my code, but I wasn't. By the time I fixed that, I had changed so much in the DB that I think I ended up with a corrupted publication (or one modified so heavily that it doesn't load anything). Keep in mind that I changed just one Content column. This is really strange; I didn't see this yesterday, and I still can't explain how it behaves.

In the end, there's a lot going on behind the JWPUB format: there's even a schema specification for the publication view and, alongside the Words table, some strange tables that look like a precompiled search index. I keep thinking about what the forum of some reading program answered when asked to add support for JW files: they said these files make requests to the JW API. I don't trust everything I read, but this stuck in my mind, even though it doesn't make sense: why would a file of 100 MB or more need anything from JW? And if it does, how are the pioneer books distributed?

MrCyjaneK commented 3 years ago

I'll check the network thing tonight. I'll download a publication and watch the traffic in Burp Suite; that should clarify whether any requests are sent or not.

MrCyjaneK commented 3 years ago

So first of all, I had some problems with Android Studio, then it got late and I forgot to reply. After downloading publications there were no requests (except for a few images that were unrelated to the publication).

MrCyjaneK commented 3 years ago

> Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option, considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

Yeah... but if we fail, that's the only option.

> In the end, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, except that they are heavily modified and for some reason there's binary data that doesn't match a list of words.

Not really. It can be converted on the fly and then just kept in some HTML format.

> I will try doing something with my knowledge of Python. I don't know whether I'll be of any help, but at least I'll try for fun. I really like being able to use JW Library on a PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think it will stay that way for long; maybe someday they'll release a version for a popular Linux distro.

That's sad :( I wish there were a decent Watchtower Library app made with GTK ;p

> A final question... I saw that your project uses JW's app API, but is this allowed? Isn't it a violation of some of the app's terms of use?

Since I don't reupload the content, it is legal, but I'm not a lawyer.

https://www.jw.org/en/terms-of-use/

And even if it's against the terms... sigh. I'm not switching back to Android, so I'll continue to develop this app.

(Sorry for the late reply... I missed this comment.)

orangethewell commented 3 years ago

No no! It's okay, brother! I don't have much time for research these days either; after all, I'm still 15 years old and have some homework to do for school. ^^

But I'll keep following the project, and if I find something new, I'll post another reply on this issue.

And if you can't get anything new out of the JWPUB conversion, you still have easier tasks to work on, like the video player :) (I really like how the PC JW Library app can be easily "hacked" to add a new video, lol)

MrCyjaneK commented 2 years ago

After spending hours on this thing, I won't continue to reverse engineer the JWPUB format until somebody does that... for me.

For now I'll try to move to flooding the wol.jw.org APIs and getting the publications page by page (thanks for abandoning EPUB, btw).

image

MrCyjaneK commented 2 years ago

Ignore what this weirdo said.

It's here: https://github.com/Miaosi001/JW-Library-macOS/blob/main/JWLibrary/SubViews/PubbView.swift

MrCyjaneK commented 2 years ago

https://github.com/Miaosi001/JW-Library-macOS/blob/main/JWLibrary/Robe/AppuntiDel2019.png

mjacobus commented 1 year ago

@MrCyjaneK did you figure out how to read Document.Content?

MrCyjaneK commented 1 year ago

@mjacobus https://github.com/darioragusa/JW-Library-macOS/issues/1#issuecomment-1079989526

MrCyjaneK commented 1 year ago

I'm not working on this app anymore. I don't want to spend time on an open source alternative to something that is clearly using DRM when it shouldn't (can somebody give me one single reason why it is worth encrypting such content when it is freely available?). Also, I don't feel like playing a cat-and-mouse game when somebody can just change the way the API sends publications and cut support for earlier versions.

And the elephant in the room: WHY isn't the app open source in the first place?

Until somebody gives me answers to those questions, I'm not going to work on this project. wol.jw.org is enough for me.

</project>

darioragusa commented 1 year ago

Security? If anyone could get a publication and easily edit it the risk of spreading misleading information would be very high.

MrCyjaneK commented 1 year ago

@darioragusa As they can do with .epub, .mobi, and .pdf.

Also, there is a tool for that which is widely used on the internet: you can sign things with PGP. That would allow third-party apps to be developed and would carry less risk (currently we can edit the publications; DRM is defectivebydesign.org).

darioragusa commented 1 year ago

@MrCyjaneK I know you can edit the other formats without problems, but most of us use the JW Library app. I download a JWPUB knowing that it comes from jw.org or the app, and I trust the content. It's not a random txt file sent by a random guy and opened with Word or Adobe Reader, which may or may not contain the correct information. An example: if I send my grandma an EPUB, she may not be able to open it, but if I send her a JWPUB, she taps the file, a trusted app she always uses shows up, and for her it's all OK: a normal article with the reliable content that is supposed to be there. A JWPUB can still be edited, but it's not something that anyone with basic knowledge of Word can do: fewer editors -> fewer edited files. Perhaps I'm totally wrong, but those are my two cents.

MrCyjaneK commented 1 year ago

> fewer editors -> fewer edited files. Perhaps I'm totally wrong, but those are my two cents.

The thing is, the current method allows editing, whereas signing would make undetected editing impossible while still allowing modders like us to easily read the content.

darioragusa commented 1 year ago

I don't know much about signing files, but I guess the app would have a key, and using this key with (something, idk) it would get a value. Is it like checking a hash, where the value is different if a single bit changes?

MrCyjaneK commented 1 year ago

It's like checking whether the content was modified. The content can be signed to verify that it was created by a specific person, and after modding it the signature will no longer match. It's like encrypting, except that you can see the content but can't modify it without invalidating the signature.

darioragusa commented 1 year ago

OK, but this way wouldn't they have to save the signatures for every version of every article in every publication in every language?

MrCyjaneK commented 1 year ago

PGP signatures do not add a lot of extra size to a publication, so I don't consider this a problem. (You could also sign a sha512sum of the publication and get a similar result.) Plus, you can sign them as they are served for download.
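For reference, a minimal sketch of computing the digest that would then be signed, equivalent to the hex string that `sha512sum` prints for a file. The function name and chunk size are my own choices, not anything from the JWPUB tooling.

```python
import hashlib

def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large publications don't need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest has a fixed size (128 hex characters) no matter how large the publication is, which is why signing it adds so little.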

darioragusa commented 1 year ago

If the signature is stored with the publication, what stops me from changing it?

MrCyjaneK commented 1 year ago

You can change it; you can even sign it with your own key, but then the signature will not verify against the original signer's key.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This is a message I have decided to sign, try to mess with it and it will no longer be signed.
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCAAdFiEE0gTaRRUXZfyrr8PQPD6SA9PleeEFAmO36L8ACgkQPD6SA9Pl
eeERiAv+MXm2VjIZMvOgXwKT5bDmwMpfK8liOdT/IhoFvNsTwMiWUQRHzp12OJtz
U+V26gq6lmBJsKsyij6AAvefy048mAzGnAMRR5c9uqkYs2R66jqUIRNERCE2XKdu
uiJAhmpMqNughA0/h19/As1xCrepZpo+W1SEE8yEPZp13eZ0gylmS0pBqXR5QcHB
JNAIMV84xOAntQNe2dzs6lBhhWdF3EvE5L50so2EiXGulr5mIdPwIkaCUSQIZYRd
2aWLwcA4j8ZN/UfY6YbCyhSyH5Fm4WXZ17tsPSuOqBE7QhW100gPiQjPDGc5ZUwN
SjvRIxrvCZ9rPg/PQnAOIgxALilBW3y6Jaq73XTBFaOArkOxmWh8rFhL7OkMdyW7
ewpAVjU90ChYEJ17BZpM+cSYIYcRwsYdtNQcQVl1fViBFlBFY1PEm4mvbbHK4GLQ
aRBnSsTbabNQQLij3hk/Wc9RLEe49pk/tmeDlqrtF5ELbFWRBtM0R63H3qfXkEL7
vFqGsJB4
=gBZ4
-----END PGP SIGNATURE-----

MrCyjaneK commented 1 year ago

image image

darioragusa commented 1 year ago

Rephrasing: what stops me from changing it and signing it again with their key? Just like we got this value 11cbb5587e32846d4c26790c633da289f66fe5842a3a585ce1bc3a294af5ada7, someone will leak the key used to sign.

MrCyjaneK commented 1 year ago

The fact that only the public key used to verify is available to us, while the private key used to sign publications is kept secret and offline. You can't sign something with a public key.

darioragusa commented 1 year ago

Oh, thanks, got it. The only thing I can think of is that I don't know how feasible it is to keep it offline, but yes, in theory it would be better.

MrCyjaneK commented 1 year ago

It works just fine offline, and it's commonly used by every Linux package manager that I have used, as well as for distributing live ISOs of systems.

Anyway, I just wish for some simple docs for that format... or at least for them not to abandon good old .epub...

orangethewell commented 1 year ago

I've almost got a fully working reader for the JWPUB format, or at least for what you guys had figured out. There are just two issues that stop me from publishing it:

  1. Legal issues with the jw.org terms of service. (After all, using the key they obtained is basically reverse engineering, and after recent updates to the JW Library terms of service, they say that we can't use any part of the app's code.) A note on this: there's a JW developer support email address, but I couldn't get any response from them.
  2. I simply haven't finished the project, but the base code for decoding the content is working and is included in my project, which is written in Rust.

Even so, my use would be mostly personal, so as long as I don't post anything on the internet, there are no legal issues for me. Or maybe I can publish the finished program without infringing any legal rights; it's just a reader and doesn't access the official site's API at all.

MrCyjaneK commented 1 year ago

@orangethewell IANAL and this is not legal advice, BUT go ahead and publish it if you want, and feel free to link it here.

I'm not working on my project and not using official apps for reasons stated above https://github.com/MrCyjaneK/jwapi/issues/1#issuecomment-1369734896, but if some open source alternative would appear I'd be very happy to use it.

Worthington412 commented 1 year ago

I'd like to be able to modify jwpub files so i can read other books during the meeting in the app, and no one would be the wiser. I've been sitting through close to 40 years of meetings, assemblies, KMS, etc. and I'm so sick of the same repetitive junk that I'd like to be able to read something interesting without the church lady sitting watching my every move realizing. Sounds like watchtower has prevented this from being a plausible way of making meetings bearable. Not surprised.

gabriel-elesbao commented 1 year ago

> I'd like to be able to modify jwpub files so i can read other books during the meeting in the app, and no one would be the wiser. I've been sitting through close to 40 years of meetings, assemblies, KMS, etc. and I'm so sick of the same repetitive junk that I'd like to be able to read something interesting without the church lady sitting watching my every move realizing. Sounds like watchtower has prevented this from being a plausible way of making meetings bearable. Not surprised.

If you've been going for 40 years and are dissatisfied, it's simple: ask to leave. No one is obliged to stay.

orangethewell commented 1 year ago

Some news from my rusty implementation, lol:

Anyway, I've got energy for this project again now that my UI library is working well for me again. Any news I get, I'll post here.

MrCyjaneK commented 1 year ago

Good job @orangethewell. Hit us with results once ready ;)

orangethewell commented 1 year ago

I published Open Witness Library in a public repo. It isn't 100% working, but the basics work, including images and a document summary. You can track the project development here. :)

Note: Sorry, due to a mistake I made, it only reads PT-BR pubs ("T"-tagged jwpubs). I will push a commit to fix it as fast as possible.

MrCyjaneK commented 1 year ago

@orangethewell that is awesome! I'll check it out when I have some time.

livrasand commented 11 months ago

@MrCyjaneK Do you think PGP would work? Where could it be embedded? In the manifest.json? I like PGP.

MrCyjaneK commented 11 months ago

It could be embedded in the app. I could even implement it myself if I got access to the app source and the backend. But there's no chance of that, I guess.

livrasand commented 11 months ago

I am currently in communication with Bethel, together with my circuit overseer, and I would like to pass on what you have said. If you are willing, I can give you the credit, but could you give me more details on how you would use PGP?

If you don't feel like it, or don't want to, don't worry, I understand.

MrCyjaneK commented 11 months ago

@livrasand

  1. Get rid of the pointless DRM (the encryption that is currently in the JWPUB format)
  2. Sign the files using a PGP key (a private key on the server that signs all publications)
  3. Embed the matching public key in the JW Library app
  4. Verify upon download that the downloaded JWPUB matches the signature

Also, if you have got that far (I tried for 2 years to contact someone responsible for the app to say that there was an RCE issue in the JW Library app, and failed to reach anybody who would care about it), consider asking them about open sourcing the app. There is nothing to lose, since nobody makes a profit on the app, and I'd really like to patch some features and add some modifications to the app's functionality. Keep me updated on how it goes.
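The sign-then-verify flow from the numbered steps above can be sketched in a few lines of Python. To be loud about the assumptions: this is NOT PGP and not secure. It is unpadded textbook RSA over a SHA-256 digest, built from two well-known Mersenne primes, and every name in it is made up for illustration. It needs Python 3.8+ for `pow(E, -1, ...)`. The only point is that verifying needs just the public values, while signing needs the private exponent.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Illustration only.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                          # public modulus (would ship in the app)
E = 65537                          # public exponent (would ship in the app)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (stays on the server)

def digest_as_int(data: bytes) -> int:
    """SHA-256 of the data as an integer (always smaller than N here)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(data: bytes) -> int:
    """Server side: requires the private exponent D."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Client side: requires only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(data)

publication = b"contents of some publication"
signature = sign(publication)
print(verify(publication, signature))               # prints: True
print(verify(publication + b" edited", signature))  # prints: False
```

A real deployment would use an actual OpenPGP implementation (or another vetted signature scheme) rather than anything like this.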

livrasand commented 11 months ago

Perfect, thank you very much. I'll give you credit (although I don't know exactly how, maybe I'll mention your nickname), and I'll let you know Bethel's response...

orangethewell commented 10 months ago

Does anyone know where the comment is in which someone listed every category found in JWPUB files? Like books (bk), watchtowers (wt), etc.

arthurwweber commented 10 months ago

After doing some digging into this matter there are a few observations I wish to make:

MrCyjaneK commented 10 months ago

Friendly reminder that security by obscurity is not security.

Not to mention that JW Library as an app is just a terrible piece of code that tends to run slowly and registers many handlers for different kinds of files. I'd honestly develop an app on my own, but with the attitude of the JW dev team I feel that it would be a waste of my precious time. Instead I'm just using wol.jw.org on my phone (since the official app won't run, and I can't create my own).

And as I've said in some other issue, there is no point in me spending time on reverse engineering this abomination with the jwpub extension.

Being hostile towards other developers is a terrible thing to do, and it honestly feels wrong to encrypt JWPUB, why would you force people to use your own badly.

> If you try decompiling the latest version of JW Library, all identifiers are obfuscated and it's much more challenging to parse through the code.

For that exact reason I consider the JW Library app malware (since it malfunctions frequently). Why would you make it so hard for a group of a few people who just want to read the publications in their own way? What is in there that you want to hide? Why don't you simply provide an up-to-date .epub on the site?

> All the publications have for the app to check their integrity is the SHA256 hash of the payload in the ZIP container, which is stored in manifest.json.

So basically they store the information about whether the file is correct inside that same file? Lol.
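A rough sketch of what such an integrity check could look like, going only by the description quoted above. The member name `contents` and the manifest key `hash` are assumptions on my part, so they are parameters and not hard-coded facts about the format.

```python
import hashlib
import json
import zipfile

def payload_matches_manifest(jwpub, payload_name="contents", hash_key="hash"):
    """Compare the payload's SHA-256 with the value recorded in manifest.json.

    `jwpub` is a path or a binary file object for the outer ZIP container.
    `payload_name` and `hash_key` are hypothetical defaults, not confirmed here.
    """
    with zipfile.ZipFile(jwpub) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        actual = hashlib.sha256(archive.read(payload_name)).hexdigest()
    return actual == str(manifest[hash_key]).lower()
```

Because the expected hash travels inside the same archive, anyone who edits the payload can recompute the hash and update the manifest, so a check like this only catches accidental corruption.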

MrCyjaneK commented 10 months ago

Also, it's been more than 2 years since I tried to report a security issue in JW Library. So yeah, I ain't using that.

Not to mention that the latest Bible is available only in the JWPUB format, together with other books.

I just hate the decisions that the developers on the app dev team make. Like, why not use Widevine and Play Integrity in the app? And the Web Integrity API on the website? After that it would be the number one most inaccessible website for people who want to make it better.

So one last thing: if you don't deliver a good product (which the JW Library app is not), then at least don't focus on making it harder for people to fix what you delivered broken.

disgusting.

livrasand commented 10 months ago

> anyone know where is the comment with someone pointed every category found on jwpub files? Like books (bk), watchtowers (wt), etc.

Yes, in fact I have it and I published it a long time ago. I'll just leave it here in case it helps someone...

| Publication Category | Publication Category Symbol | Translation (Spanish) |
| -- | -- | -- |
| Bibles | bi | Biblias |
| Insight | it | Perspicacia |
| Index | dx | Índice |
| Watchtower | w | Atalaya |
| Awake | g | Despertad |
| Kingdom Ministry | km | Nuestro Ministerio del Reino |
| Books | bk | Libros |
| YearBooks | yb | Anuarios |
| Brochures | brch | Folletos |
| Tracts | trct | Tratados |
| Program | pgm | Programas |
| Meeting WorkBook | mwb | Guía de Actividades |
| Manual/Guidelines | manual | Pautas y Manuales |
| Talk | talk | Bosquejos/Discursos |
| Letter | letter | Cartas |
| Web | web | Artículo de JW.ORG |
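In case it saves someone some typing, here is the same list as a small Python lookup. The symbols and English names are copied from the table; the dictionary and function names are just placeholders of mine, and returning "Unknown" for unlisted symbols is an arbitrary choice.

```python
# Category symbol -> English category name, copied from the table above.
CATEGORY_BY_SYMBOL = {
    "bi": "Bibles",
    "it": "Insight",
    "dx": "Index",
    "w": "Watchtower",
    "g": "Awake",
    "km": "Kingdom Ministry",
    "bk": "Books",
    "yb": "YearBooks",
    "brch": "Brochures",
    "trct": "Tracts",
    "pgm": "Program",
    "mwb": "Meeting WorkBook",
    "manual": "Manual/Guidelines",
    "talk": "Talk",
    "letter": "Letter",
    "web": "Web",
}

def category_name(symbol: str) -> str:
    """Look up a category symbol, ignoring case and surrounding whitespace."""
    return CATEGORY_BY_SYMBOL.get(symbol.strip().lower(), "Unknown")

print(category_name("bk"))   # prints: Books
print(category_name("MWB"))  # prints: Meeting WorkBook
```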