kiwix / overview

https://kiwix.org

ZIM Filesystem Fuse Module #79

Open rgaudin opened 1 year ago

rgaudin commented 1 year ago

This ticket tracks the Kiwix GSoC 2023 Project ZIM Filesystem Fuse Module until code contributions and/or specific tickets require creating its own dedicated repository (or it falls into zim-tools).

Candidates, contributors, this ticket is the preferred location to discuss this project. Ask your questions here and mentors (@mgautierfr, @kelson42) shall respond.

Mandatory reads:


ZIM Filesystem Fuse Module

Objective: we need to create a filesystem fuse module that enables access to the content of a ZIM file, allowing users to view entries as files without using zimdump.

Technologies: C++, Linux internals, FUSE

Description:

Kiwix provides offline access to Wikipedia and other educational content through its ZIM file format. Inspecting ZIM files is very useful for developers and ZIM creators. While the zimdump tool exists, it is not as convenient and easy to use as a filesystem. Therefore, we want to make it easier for users to access the contents of a ZIM file by creating a (read-only) filesystem fuse module.

The ZIM filesystem fuse module will be written in C++ and will use the libzim and FUSE libraries to enable access to the contents of a ZIM file as if it were a regular (yet read-only) filesystem. The module will allow users to view the ZIM entries as a tree of folders and files, the latter being readable as regular ones. This will make it easier for users to access the content of a ZIM file and will provide a more user-friendly interface for exploring its contents.
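As a rough illustration of the "tree of folders and files" idea, here is a minimal sketch that turns flat entry paths into an in-memory tree. It deliberately uses neither libzim nor FUSE: `Node`, `add_entry`, `find`, and `list_dir` are hypothetical names invented for this example, and it assumes entry paths are `/`-separated strings.

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory node; not a libzim or FUSE type.
struct Node {
    bool is_dir = false;  // becomes true once the node gains a child
    std::map<std::string, std::unique_ptr<Node>> children;  // sorted by name
};

// Insert one entry path such as "A/Main_Page", creating folders as needed.
void add_entry(Node& root, const std::string& path) {
    Node* cur = &root;
    std::istringstream parts(path);
    std::string name;
    while (std::getline(parts, name, '/')) {
        if (name.empty()) continue;  // ignore a leading "/" or doubled "//"
        cur->is_dir = true;          // anything with a child is a folder
        auto& child = cur->children[name];
        if (!child) child = std::make_unique<Node>();
        cur = child.get();
    }
}

// Resolve a path to its node, or nullptr if no such entry exists.
const Node* find(const Node& root, const std::string& path) {
    const Node* cur = &root;
    std::istringstream parts(path);
    std::string name;
    while (std::getline(parts, name, '/')) {
        if (name.empty()) continue;
        auto it = cur->children.find(name);
        if (it == cur->children.end()) return nullptr;
        cur = it->second.get();
    }
    return cur;
}

// Names directly under `path`, in sorted order; empty if `path` is not a folder.
std::vector<std::string> list_dir(const Node& root, const std::string& path) {
    std::vector<std::string> names;
    const Node* dir = find(root, path);
    if (dir == nullptr || !dir->is_dir) return names;
    for (const auto& kv : dir->children) names.push_back(kv.first);
    return names;
}
```

With a root node marked as a folder, adding the made-up sample paths `A/Main_Page`, `A/Kiwix`, and `I/logo.png` yields two folders, `A` and `I`, and `list_dir(root, "/A")` returns `Kiwix` and `Main_Page`. A real module would have to decide what to do when one entry path is also a prefix of another; this sketch simply treats such a node as a folder.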

Key Deliverables:

Skills required:

Difficulty: Hard. Expect 350 hours of work.

lyc8503 commented 1 year ago

Hello, I am a student from GSoC and I am interested in Kiwix's idea of persisting web pages offline.

I am familiar with C++ and Linux development, and I also have some open-source repos of my own on GitHub. Now I want to join a bigger open-source project to experience the open-source community and help with the project.

I have already read the implementation of the zimdump tool, and I think I can help implement the FUSE module. I would appreciate it if I could get the opportunity to work on this idea.

lyc8503 commented 1 year ago

https://github.com/lyc8503/zimfuse I spared some time and wrote a tiny demo that uses libzim and libfuse3 to implement the readdir and getattr functions, which lets users run cd and ls to see the structure of a ZIM file. There is still much work to do, but I think I can dig deeper and write a better, more complete implementation if given enough time during GSoC.
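For readers unfamiliar with what a getattr callback has to produce, the following is a minimal sketch of the attribute-filling logic for a read-only filesystem. It is not taken from the linked demo and does not call libfuse or libzim: `Index` and `fill_attr` are hypothetical names, and the path-to-size map stands in for whatever lookup a real module would do against the ZIM file. It only assumes POSIX `struct stat` and the FUSE convention of returning 0 on success or a negative errno value.

```cpp
#include <sys/stat.h>

#include <cassert>
#include <cerrno>
#include <cstring>
#include <map>
#include <string>

// Hypothetical index: absolute path -> size in bytes, or -1 for a folder.
using Index = std::map<std::string, long>;

// Fill `st` the way a read-only getattr callback typically would.
// Returns 0 on success or -ENOENT when the path is unknown.
int fill_attr(const Index& index, const std::string& path, struct stat* st) {
    assert(st != nullptr);
    std::memset(st, 0, sizeof(*st));

    auto it = index.find(path);
    if (it == index.end()) return -ENOENT;

    if (it->second < 0) {
        st->st_mode = S_IFDIR | 0555;  // folder: read and traverse, no write
        st->st_nlink = 2;
    } else {
        st->st_mode = S_IFREG | 0444;  // regular file: read-only
        st->st_nlink = 1;
        st->st_size = it->second;
    }
    return 0;
}
```

Given an index containing `/A` as a folder and `/A/Main_Page` as a 1234-byte file (made-up values), `fill_attr` reports a directory with mode 0555 for the former and a regular file of size 1234 for the latter. Leaving out every write bit is what makes tools such as ls show the mount as read-only.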

Darkcoder011 commented 1 year ago

Hey all, I'm Abhijit Dengale (2nd-year B.Tech in CSE). I am here because I want to take part in this project. I have already solved more than 90 DSA problems in C++ and I have worked on Linux for more than a year, so can I take part?

@rgaudin

lyc8503 commented 1 year ago

Hi, I have submitted my proposal on the GSoC platform and I am willing to hear any suggestions and improve it. I am not very familiar with Slack but I will check messages there regularly. I am wondering what the preferred way to get in touch with a mentor is, or should I just wait for a mentor to contact me?

juuz0 commented 1 year ago

I've submitted my proposal for this on the GSoC website :>

opk12 commented 1 year ago

libarchive is a well-known compression library, used by a number of free software projects. archivemount is a FUSE filesystem based on libarchive.

What about integrating with libarchive? The user would be able to

rgaudin commented 1 year ago

That's an interesting possibility. Can you share examples of such libarchive-using software?

opk12 commented 1 year ago

@rgaudin A quick look at libarchive's website gives LibarchiveUsers, where you can find Arch Linux's pacman(!), gvfs, and ark. Any regular GNU/Linux user will recognize more than a few names in the 88 packages listed by running apt-cache rdepends libarchive13 on the current Debian stable (bullseye). Others are on the Internet but not in Debian.

rgaudin commented 1 year ago

Well, I don't see anything in this list that would benefit from more than a mounted FUSE filesystem. I thus don't see the value for libzim.

FUSE modules are very simple and easy to distribute, deploy, and use. Integrating into libarchive would definitely be more difficult and, more importantly, we'd be on the libarchive release schedule, which would be less flexible.

Just my uninformed opinion though.