kiwix / kiwix-android

Kiwix for Android
https://android.kiwix.org
GNU General Public License v3.0

v. 3.8.1+3.9.0: Error on opening a split zim file on external storage #3605

Closed AxelBoldt closed 4 months ago

AxelBoldt commented 9 months ago

I'm running Kiwix v. 3.8.1 build 7230801 on a BLU G90 phone running Android 10.

I have a Wikivoyage zim file, in the Android/data/org.kiwix.kiwixmobile/files/Kiwix folder of the phone's internal storage, and a split Wikipedia zim file (.zimaa to .zimai) in the top level of the SD card external storage. Both are shown in the app's Library, and both used to work fine.

Today I noticed that opening the Wikipedia file results in the error "The selected file is not a valid zim file". I get the same error if I try to add the .zimaa file manually to the library. The (unsplit) Wikivoyage zim file continues to work fine.

Log file is attached. kiwixlogs.txt

AxelBoldt commented 9 months ago

I just downgraded to Kiwix 3.5.0 and was able to add the split Wikipedia zim file to the Library. Everything works fine. I did not try any other Kiwix app versions.

kelson42 commented 9 months ago

@AxelBoldt Thank you for your bug report. I believe this is a duplicate of #3577, which is fixed in nightly and in 3.9.0, to be released later today. Could you please give it a try and reopen the ticket if this is still not fixed?

AxelBoldt commented 9 months ago

@kelson42 I installed version 3.9.0 and I get the same "The selected file is not a valid zim file" error if I try to add the split Wikipedia .zimaa file from my SD card.

I then tried version 3.7.1 and there everything is fine.

kelson42 commented 9 months ago

@AxelBoldt What is this ZIM file exactly? How did you split it? Recent versions of Kiwix can handle split ZIM files, but the splitting can no longer be done as arbitrarily as before.

AxelBoldt commented 9 months ago

It is wikipedia_de_all_novid_2018-11.zim. I have used it for years. I split it with the Linux split command into .zimaa to .zimai; each file is smaller than 4 GB because of the FAT file system restriction.
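For readers unfamiliar with this kind of byte-level splitting, here is a minimal Python sketch of what it amounts to. It is an illustration only: `split_file` and `chunk_suffixes` are hypothetical helper names, the 4 GiB minus one byte limit is the FAT32 maximum file size, and the sketch cuts at arbitrary byte offsets, which (per the comment above) recent Kiwix versions may not accept.

```python
import itertools
import string
from pathlib import Path

# Largest file FAT32 can store, in bytes (4 GiB minus one byte).
FAT32_MAX_FILE_SIZE = 4 * 1024**3 - 1


def chunk_suffixes():
    """Yield 'aa', 'ab', ..., 'zz', the two-letter suffixes used in this thread."""
    for first, second in itertools.product(string.ascii_lowercase, repeat=2):
        yield first + second


def split_file(source, chunk_size=FAT32_MAX_FILE_SIZE, buffer_size=1024 * 1024):
    """Write source as source+'aa', source+'ab', ... and return the new paths."""
    source = Path(source)
    parts = []
    suffixes = chunk_suffixes()
    with source.open("rb") as src:
        while True:
            remaining = chunk_size
            data = src.read(min(buffer_size, remaining))
            if not data:
                break  # nothing left, so no empty trailing chunk is created
            part = source.with_name(source.name + next(suffixes))
            with part.open("wb") as dst:
                while data:
                    dst.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break  # this chunk is full, start the next one
                    data = src.read(min(buffer_size, remaining))
            parts.append(part)
    return parts
```

Calling `split_file("wikipedia_de_all_novid_2018-11.zim")` would write `.zimaa`, `.zimab`, and so on next to the original file, each no larger than `chunk_size` bytes.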

kelson42 commented 8 months ago

@MohitMaliDeveloper @mgautierfr I believe that if you open a split file with an fd, libzim won’t be able to reach the chunks other than the first one, and it will fail. This should be confirmed. See https://github.com/openzim/libzim/issues/841

mgautierfr commented 8 months ago

I confirm that opening a split file using fd will not work.

But I thought we were opening by fd only for custom apps. When opening from one or more files on the SD card, we should pass the path, not the fd (I don't know what the Android code is doing).
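The difference matters because a path lets a reader look for sibling chunks next to the first one, while a bare file descriptor carries no file name to derive them from. A minimal Python sketch of such a lookup, assuming the `aa`, `ab`, ... suffixes described in this thread; `find_chunks` is a hypothetical helper for illustration, not libzim's actual code:

```python
import itertools
import string
from pathlib import Path


def find_chunks(first_chunk):
    """Return the consecutive chunks that start at a path ending in 'aa'."""
    first_chunk = Path(first_chunk)
    if not first_chunk.name.endswith("aa"):
        raise ValueError("expected the first chunk, e.g. 'archive.zimaa'")
    base = first_chunk.name[:-2]  # 'archive.zimaa' -> 'archive.zim'
    chunks = []
    for a, b in itertools.product(string.ascii_lowercase, repeat=2):
        candidate = first_chunk.with_name(base + a + b)
        if not candidate.is_file():
            break  # stop at the first gap in the sequence
        chunks.append(candidate)
    return chunks
```

With only an fd for `archive.zimaa`, there is no equivalent of `first_chunk.with_name(...)`, so the remaining chunks cannot be located this way.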

kelson42 commented 8 months ago

@MohitMaliFtechiz What is going on here? Do we open by fd or by path?

MohitMaliFtechiz commented 8 months ago

@kelson42 We open the ZIM file via filePath in all features except when a ZIM file is opened via deep linking (when a user clicks on a ZIM file in their storage).

MohitMaliFtechiz commented 8 months ago

@kelson42 I confirm this bug. I split this ZIM file with zim-tools 3.1.0 on Linux; the resulting file does not load with 3.9.1 but works correctly with 3.7.1. The difference between the two versions is that 3.7.1 uses the older libkiwix (10.1.1), while 3.9.1 uses libkiwix (1.0.0), which uses the new wrapper.

The error message shown by libkiwix is: Dirent pointer table outside (or not fully inside) ZIM file. @mgautierfr Is there any change in the new wrapper that prevents it from loading the same split ZIM file? Or did I split it with the wrong zim-tools version?

We are not using fd here to open the ZIM file. We are using the filePath to open this file.

kelson42 commented 8 months ago

@MohitMaliFtechiz please give all the details. All filenames, sizes and the libkiwix primitive you use and the values given.

MohitMaliFtechiz commented 8 months ago

@kelson42, @mgautierfr I am using a split copy of https://download.kiwix.org/zim/wikipedia/wikipedia_ar_all_nopic_2023-11.zim

After splitting the zim file, all parts are like this:

  1. wikipedia_ar_all_nopic_2023-11.zimaa (2.1 GB, 2,14,74,74,503 bytes)
  2. wikipedia_ar_all_nopic_2023-11.zimab (2.1 GB, 2,14,74,71,095 bytes)
  3. wikipedia_ar_all_nopic_2023-11.zimac (1.3 GB, 1,28,14,26,182 bytes)
  4. wikipedia_ar_all_nopic_2023-11.zimad (3.1 GB, 3,14,05,16,010 bytes)
  5. wikipedia_ar_all_nopic_2023-11.zimae (0 bytes)

I am opening this path with both the older libkiwix and the new java-libkiwix: /storage/58DD-0C22/Download/wikipedia_ar_all_nopic_2023-11.zimaa, which is the first part of the split ZIM file.

With the older libkiwix it works fine. But with the new java-libkiwix there is an error while opening this ZIM file.

Error logs:

create with path = /storage/58DD-0C22/Download/wikipedia_ar_all_nopic_2023-11.zimaa
error thrown by libkiwix = Dirent pointer table outside (or not fully inside) ZIM file.

kelson42 commented 7 months ago

@mgautierfr ?

Jaifroid commented 6 months ago

@MohitMaliFtechiz There shouldn't be a file with 0 bytes in -- that looks like a splitting error?

MohitMaliFtechiz commented 6 months ago

> @MohitMaliFtechiz There shouldn't be a file with 0 bytes in -- that looks like a splitting error?

@Jaifroid At first I also thought this was a splitting error, because one part of the ZIM file had 0 bytes. I split the ZIM file again, but the result was the same.

@mgautierfr Is there any change in the new wrapper that prevents it from loading the same split ZIM file? Or did I split it with the wrong zim-tools version?

I split this ZIM file with zim-tools 3.1.0, which is available for Linux. Did I split the ZIM file with the wrong version?

However, this split ZIM file still works with Kiwix version 3.7.1.

kelson42 commented 5 months ago

> I split this ZIM file with zim-tools 3.1.0 that is available for Linux. Did I split the ZIM file with the wrong version?

I don't think so, but the fact that we have an empty chunk at the end is clearly a bug... although AFAIK this should not impact anything. You can keep the wikipedia_ar_all_nopic_2023-11.zimae or not, that should work in both cases.

kelson42 commented 5 months ago

@MohitMaliFtechiz Can you please try to open wikipedia_ar_all_nopic_2023-11.zim in place of wikipedia_ar_all_nopic_2023-11.zimaa? To me it looks like the handling of split ZIM files has been broken at some point in Kiwix Android.

Putting back in the 3.10.0 milestone.

kelson42 commented 4 months ago

See https://github.com/openzim/libzim/issues/879 as well for a long term fix at libzim level

MohitMaliFtechiz commented 4 months ago

> @MohitMaliFtechiz Can you please try to open wikipedia_ar_all_nopic_2023-11.zim in place of wikipedia_ar_all_nopic_2023-11.zimaa? To me it looks like the handling of split ZIM files has been broken at some point in Kiwix Android.

@kelson42 I have tried to open wikipedia_ar_all_nopic_2023-11.zim in place of wikipedia_ar_all_nopic_2023-11.zimaa, but libzim cannot open this ZIM file and shows the same error as before, so the functionality is not broken on the Android side. It seems libzim is instantiating a Single reader instead of a MultiPart file reader for split ZIM files as well, since this is the error libzim shows for ZIM files that are broken.
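To illustrate why reading only the first chunk could produce this particular error, here is a rough Python sketch. It assumes the 80-byte ZIM header layout from the openZIM file format documentation, where the entry count sits at offset 24 and the URL pointer list position at offset 32, with 8 bytes per pointer. `pointer_table_fits` and `make_header` are hypothetical helpers, the sizes are made-up numbers, and this is not libzim's actual check.

```python
import struct

ZIM_MAGIC = 0x044D495A  # 72173914, stored little-endian as b"ZIM\x04"
# magic, major, minor, uuid, entryCount, clusterCount, urlPtrPos, titlePtrPos,
# clusterPtrPos, mimeListPos, mainPage, layoutPage, checksumPos (80 bytes)
HEADER_FORMAT = "<IHH16sIIQQQQIIQ"


def pointer_table_fits(header, available_size):
    """Return True if the URL pointer list ends within available_size bytes."""
    fields = struct.unpack(HEADER_FORMAT, header[:80])
    magic, entry_count, url_ptr_pos = fields[0], fields[4], fields[6]
    if magic != ZIM_MAGIC:
        raise ValueError("not a ZIM header")
    table_end = url_ptr_pos + 8 * entry_count  # one 8-byte offset per entry
    return table_end <= available_size


def make_header(entry_count, url_ptr_pos):
    """Build a synthetic header for the demonstration (all other fields dummy)."""
    return struct.pack(HEADER_FORMAT, ZIM_MAGIC, 6, 1, bytes(16), entry_count,
                       0, url_ptr_pos, 0, 0, 80, 0xFFFFFFFF, 0xFFFFFFFF, 0)


header = make_header(entry_count=10, url_ptr_pos=5000)
first_chunk_only = 4096   # size a single-file reader would see (made up)
all_chunks = 8192         # combined size of every chunk (made up)
print(pointer_table_fits(header, first_chunk_only))  # False
print(pointer_table_fits(header, all_chunks))        # True
```

If the pointer table lives beyond the end of the first chunk, a reader that only knows the first chunk's size would reject an otherwise valid archive, which would be consistent with the hypothesis above.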

mgautierfr commented 4 months ago

Just to be sure, have you tried to open wikipedia_ar_all_nopic_2023-11.zim while keeping the files under the name wikipedia_ar_all_nopic_2023-11.zimaa (and ...ab, ...ac, ...)?

MohitMaliFtechiz commented 4 months ago

@mgautierfr I have tried these scenarios:

  1. I tried it while keeping .zimaa, .zimab, etc. in storage. That is what the zimsplit tool normally produces.
  2. Then I tried with .zim while keeping .zimab, .zimac ...
  3. Then I also tried the scenario you asked about, where the parts are named like this: .zim, .zimaa, .zimab ...

kelson42 commented 4 months ago

@mgautierfr I'm a bit puzzled by the situation. Not only is it unclear what is going on, it is also unclear how we managed to build what looks like a regression.

I'm not sure what the best approach is either, but I tend to think that we should urgently simplify the handling of chunks at the libzim level and then make a patch release.

Jaifroid commented 4 months ago

In Kiwix JS, we were never able to get the libzim WASM to handle loading of split ZIMs, despite passing them in what we consider to be a correct way to the WASM file system. It might be related in some way to this issue, if the error is indeed at libzim level. See https://github.com/openzim/javascript-libzim/issues/16.