Closed BPerlakiH closed 3 months ago
The listing of the categories is based on what is stored in the local DB.
Is it hardcoded? Where does the information come from, as this DB has to be filled at some point?
It is not hardcoded. We are talking about a user who upgrades the app. Yes, the DB is filled in with the entries from the current AppStore version of the app. The problem is that we only check whether the ZIM file ID already exists, and if it does exist in the DB, we don't update the entry. In this case we should probably update its language value.
Here there is an architectural problem:
I don't know how to solve the problem here, but ultimately things should be data driven, see https://libkiwix.readthedocs.io/en/latest/search.html?q=Categories&check_keywords=yes&area=default
For the rest, I don't understand your issue:
We have the full metadata of each ZIM file in a local SQLite database stored together with the application. The interesting parts for this ticket are the fileID and the languageCode. An upgrading user already has this, e.g.:
quote(ZFILEID) | ZNAME | ZCATEGORY | ZFLAVOR | ZLANGUAGECODE |
---|---|---|---|---|
X'746C75B67DAC87D962DBAB5542F6FE7B' | Wikivoyage | wikivoyage | nopic | en |
Now the feed has the same file, with the same ID, but the language is "eng":
<entry>
<id>urn:uuid:746c75b6-7dac-87d9-62db-ab5542f6fe7b</id>
<title>Wikivoyage</title>
<updated>2024-02-13T00:00:00Z</updated>
<summary>The collaborative travel guide</summary>
<language>eng</language>
<name>wikivoyage_en_all</name>
<flavour>nopic</flavour>
<category>wikivoyage</category>
<tags>wikivoyage;_category:wikivoyage;_pictures:no;_videos:no;_details:yes;_ftindex:yes</tags>
<articleCount>32417</articleCount>
<mediaCount>34</mediaCount>
<link rel="http://opds-spec.org/image/thumbnail"
href="/catalog/v2/illustration/746c75b6-7dac-87d9-62db-ab5542f6fe7b/?size=48"
type="image/png;width=48;height=48;scale=1"/>
<link type="text/html" href="/content/wikivoyage_en_all_nopic_2024-02" />
<author>
<name>Wikivoyage</name>
</author>
<publisher>
<name>openZIM</name>
</publisher>
<dc:issued>2024-02-13T00:00:00Z</dc:issued>
<link rel="http://opds-spec.org/acquisition/open-access" type="application/x-zim" href="https://download.kiwix.org/zim/wikivoyage/wikivoyage_en_all_nopic_2024-02.zim.meta4" length="254736384" />
</entry>
The problem is that after downloading and parsing the feed, it finds the entry by ID, and skips it, since we already have that in the DB, and the language field won't get updated.
When we want to display the categories, it searches by the current language using a query that is more or less:
SELECT ... WHERE ZLANGUAGECODE IN ('eng')
and nothing is found.
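The mismatch described above can be reproduced in isolation. The following is only a minimal sketch using Python's sqlite3 module: the table name `ZZIMFILE`, the reduced four-column schema, and the helper `import_entry_skipping_existing` are assumptions made for illustration, not the app's actual code.

```python
import sqlite3

# Simplified, hypothetical stand-in for the app's table (the real one has ~25 columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZZIMFILE (ZFILEID TEXT PRIMARY KEY, ZNAME TEXT, "
    "ZCATEGORY TEXT, ZLANGUAGECODE TEXT)"
)

# Row left behind by the previous app version, still using the alpha-2 code.
conn.execute(
    "INSERT INTO ZZIMFILE VALUES (?, ?, ?, ?)",
    ("746c75b6-7dac-87d9-62db-ab5542f6fe7b", "Wikivoyage", "wikivoyage", "en"),
)


def import_entry_skipping_existing(conn, entry):
    """Insert a feed entry only if its ID is unknown; known IDs are left untouched."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO ZZIMFILE VALUES (:id, :name, :category, :language)",
        entry,
    )
    return cur.rowcount == 1  # True only when a new row was actually inserted


# Same file ID as the stored row, but the feed now reports the alpha-3 code.
feed_entry = {
    "id": "746c75b6-7dac-87d9-62db-ab5542f6fe7b",
    "name": "Wikivoyage",
    "category": "wikivoyage",
    "language": "eng",
}
inserted = import_entry_skipping_existing(conn, feed_entry)

# The category listing filters by the current language and matches nothing.
rows = conn.execute(
    "SELECT ZNAME FROM ZZIMFILE WHERE ZLANGUAGECODE IN ('eng')"
).fetchall()
stored = conn.execute("SELECT ZLANGUAGECODE FROM ZZIMFILE").fetchall()
```

Here the entry is skipped because its ID is already known, the stored language stays "en", and the query for "eng" returns no rows.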
So this is the category listing applying to the local library?
In general, the problem is worse than it looks. The underlying question is: once we have downloaded a ZIM, should we display the metadata linked to the ZIM coming from the feed, or the metadata saved at the time the book was introduced into the local library?
But I guess ultimately this is more a question for libkiwix. @mgautierfr How does libkiwix behave on this?
@BPerlakiH For the moment, I recommend introducing a temporary fix which updates the language in the local library when necessary. This fix is to be removed in a few releases (put a comment in the code and create a dedicated issue).
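A temporary fix of this kind could look roughly like the sketch below. It is not the code that went into the app: the table name `ZZIMFILE` and the function `update_language_if_changed` are hypothetical, and only the column names come from the table shown earlier in this thread.

```python
import sqlite3


def update_language_if_changed(conn, file_id, feed_language):
    """TEMPORARY: refresh the stored language of an already-known ZIM file.

    Returns the number of rows changed (0 when the ID is unknown or the
    stored language already matches the feed).
    """
    cur = conn.execute(
        # IS NOT (rather than <>) also treats a NULL stored language as different.
        "UPDATE ZZIMFILE SET ZLANGUAGECODE = ? "
        "WHERE ZFILEID = ? AND ZLANGUAGECODE IS NOT ?",
        (feed_language, file_id, feed_language),
    )
    return cur.rowcount


# Hypothetical reduced schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZZIMFILE (ZFILEID TEXT PRIMARY KEY, ZLANGUAGECODE TEXT)"
)
conn.execute(
    "INSERT INTO ZZIMFILE VALUES (?, ?)",
    ("746c75b6-7dac-87d9-62db-ab5542f6fe7b", "en"),
)

first = update_language_if_changed(
    conn, "746c75b6-7dac-87d9-62db-ab5542f6fe7b", "eng"
)
second = update_language_if_changed(
    conn, "746c75b6-7dac-87d9-62db-ab5542f6fe7b", "eng"
)
language = conn.execute("SELECT ZLANGUAGECODE FROM ZZIMFILE").fetchone()[0]
```

The update only writes when the feed value differs from the stored one, so running it on every sync is harmless for entries that are already up to date.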
I have done a fix for this, which works both for new users (those fresh-installing the latest version of the app) and for updating users (those who already have the AppStore version).
So the problem also occurred, because the application was relying on the fact that the ZIM fileID will ultimately identify the file,
whereas the feed content changed at some point, and now the feed is using alpha-3 language codes.
The other question is do we know why/when the feed was updated? We need to be more cautious about similar changes, as they can break the apps that are already in the AppStore.
Another thing is: if we change the feed in a similar fashion for the other fields, we might bump into a similar problem again...
One possible solution for avoiding these cases is a versioned API, e.g. ...content/v1/feed and ...content/v2/feed, but that also comes with additional maintenance overhead on the back-end side.
We have about 25 fields for each ZIM file in the DB, e.g.:
Z_PK | Z_ENT | Z_OPT | ZARTICLECOUNT | ZHASDETAILS | ZHASPICTURES | ZHASVIDEOS | ZINCLUDEDINSEARCH | ZISMISSING | ZMEDIACOUNT | ZREQUIRESSERVICEWORKERS | ZSIZE | ZDOWNLOADTASK | ZCREATED | ZCATEGORY | ZFILEDESCRIPTION | ZFLAVOR | ZLANGUAGECODE | ZNAME | ZPERSISTENTID | ZDOWNLOADURL | ZFAVICONURL | ZFILEID | ZFAVICONDATA | ZFILEURLBOOKMARK |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 4 | 1 | 8 | 1 | 1 | 1 | 1 | 0 | 24 | 0 | 313540608 | 0 | 2021-01-19 00:00:00 +0000 | ted | Designer Isaac Mizrahi dashed off these stylish, breezy notes on why these 6 TEDTalks are a source of inspiration for him. | eng | Isaac Mizrahi: Talks that are in fashion | ted_en_playlist-isaac-mizrahi-talks-that-are | https://download.kiwix.org/zim/ted/ted_en_playlist-isaac-mizrahi-talks-that-are_2021-01.zim.meta4 | https://library.kiwix.org/catalog/v2/illustration/80c9f981-09ea-69a9-a87e-10aa8283ba05/?size=48 |
@BPerlakiH What is the ZIM FileID? The only ZIM ID I know is the uuid metadata, and only this should be used: https://wiki.openzim.org/wiki/ZIM_file_format#Header. It never changes and should be used to identify a specific published ZIM file.
After discussion, the solution here should be to run a special function once (only after the update) that removes the old online (not local) entries from the DB, so that everything is repopulated at the next sync. Of course, once the DB is cleared of online entries, something should be done to redownload the online feed ASAP.
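The one-time cleanup could be sketched as follows. Several things here are assumptions rather than facts from this thread: that an entry is "local" exactly when `ZFILEURLBOOKMARK` is set, that the table is called `ZZIMFILE`, and that SQLite's `PRAGMA user_version` is used as the "already ran" marker (the app may well track this differently).

```python
import sqlite3


def purge_online_entries_once(conn):
    """Delete catalog-only rows a single time; return how many rows were removed."""
    # Assumption: user_version >= 1 means the cleanup already ran.
    if conn.execute("PRAGMA user_version").fetchone()[0] >= 1:
        return 0
    # Assumption: rows without a file URL bookmark are online-only entries.
    cur = conn.execute("DELETE FROM ZZIMFILE WHERE ZFILEURLBOOKMARK IS NULL")
    conn.execute("PRAGMA user_version = 1")
    conn.commit()
    return cur.rowcount


# Hypothetical reduced schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZZIMFILE (ZFILEID TEXT PRIMARY KEY, ZLANGUAGECODE TEXT, "
    "ZFILEURLBOOKMARK BLOB)"
)
conn.executemany(
    "INSERT INTO ZZIMFILE VALUES (?, ?, ?)",
    [
        ("online-only-id", "en", None),         # catalog entry, never downloaded
        ("local-id", "en", b"bookmark-bytes"),  # file present on the device
    ],
)

removed_first = purge_online_entries_once(conn)
removed_second = purge_online_entries_once(conn)
remaining = conn.execute("SELECT ZFILEID FROM ZZIMFILE").fetchall()
```

A caller would trigger a catalog refresh right after the first run, since the online list is empty until the feed is downloaded again.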
In general, the problem is worse than it looks. The underlying question is: once we have downloaded a ZIM, should we display the metadata linked to the ZIM coming from the feed, or the metadata saved at the time the book was introduced into the local library?
But I guess ultimately this is more a question for libkiwix. @mgautierfr How does libkiwix behave on this?
I would say that libkiwix doesn't behave at all on this.
The kiwix::Library is either fed with an OPDS stream or a library.xml [*] (saved library). So the metadata comes from the input.
And on an existing (potentially empty) Library you can also add a book (and so the metadata comes from the book).
In the end, libkiwix doesn't choose. It is an application decision.
[*] When loading a library.xml, we have a boolean trustLibrary (true by default).
When it is true, we trust the library metadata. If false, we open the books to get the metadata from them. (This can be pretty slow, so it is true by default to allow a quick startup of kiwix-serve, without opening hundreds of ZIM files for "nothing".)
Fix is included in: https://github.com/kiwix/apple/issues/668
I did test the following scenario:
We have the following problem: the user's default / selected language gets updated properly once the catalog is fetched, yet the user still faces an empty list of categories. The reason is the following: