libappimage 0.2.0 proposal

azubieta commented 5 years ago

LibAppImage is the foundation of all AppImage tools; therefore, it should provide a consistent and functional API for all of its clients. Currently, its state is not the best. Among the most prominent issues we found:

duplicated code
unnecessary (utilities) entries at the public interface
mixed concerns: file examination and desktop integration
missing file examination functionalities (list files, extract files, appimage files traversal)
dependencies included as submodules. Not all libs are made to be used as submodules, but every lib is made to be used in the traditional way (installed in the system). Using submodules also tempts developers to use the inner sections of a lib, which are not expected to be consistent between versions; this produces build failures and increases the maintenance work.

As a major modification of the public interface and the inner implementation is required, it would be a good idea to use this opportunity to improve the code base and ease maintenance by rewriting it in modern C++.

It's proposed to split the current libappimage.so into two different dynamic shared objects (DSOs): libappimage and libappimage-desktop-integration. Both will be produced from the same code base (the libappimage project). Besides the DSO version, there will be a static version which will have zero external dependencies, in order to allow its usage by third-party projects that don't want to depend on the whole libstdc++. The public header files will initially be kept in bare C, also to increase compatibility; an additional C++ interface could be published later.

The libappimage module will be responsible for: reading AppImage file information and extracting contents.

The libappimage-desktop-integration module will be responsible for: performing the desktop integration and disintegration of applications packed into AppImages.

azubieta commented 5 years ago

@probonopd @TheAssassin comments are welcome.

probonopd commented 5 years ago

I see two (somewhat related) proposals here, namely

To use "modern" C++
To split one .so into two

Use "modern" C++

C++ is mainly used in the KDE/Qt/... world whereas system level tools are mainly written in C, possibly using glib. Since we want to be desktop agnostic, I tried to write AppImageKit/appimagetool in C using glib. There is even a more basic version without glib for those who do not like that much framework use.

Especially "modern" C++ usually means trouble because it pulls in "modern" libstdc++, resulting in breakage for systems like CentOS 6 and the like.

The most "modern" C++ one should develop against at this point is the one that comes with CentOS 6 or Ubuntu 14.04. (At least that is what we are telling third-party application authors.) And I like to eat my own dogfood.

Split one .so into two

I fail to understand how breaking up something into smaller pieces solves the issues mentioned. In fact, it increases (perceived) complexity and I fight things like increasing the number of files, increasing the number of repositories, increasing the number of dependencies wherever I can. But all we seem to get all the time is ever-increasing complexity. Let's keep things simple!

azubieta commented 5 years ago

About "modern" C++: in order to remain desktop agnostic, we should have zero external dependencies, and this could be achieved by statically linking libstdc and libstdc++. This might sound like we will end up with large binaries, but with the proper linker instructions, all code sections that are not used can be removed from the binary. Also, the main interface will be pure C, so all kinds of desktop environments will be able to use it. C++-friendly desktop environments could use the non-statically linked version.

It's important to say that our libs will keep being shared libs; they will only embed the required sections of libstdc and libstdc++.

Split one .so into two

The reason for this is to not mix desktop integration logic with AppImage reading logic (see separation of concerns). Client applications that only need to extract a given file will then not have to link to the whole blob. The public APIs will be kept minimal, which eases maintenance and reduces backward compatibility issues. Also, since the desktop integration code is more likely to change than the AppImage reading code, client apps that only link libappimage will not be affected by changes in the other library.

probonopd commented 5 years ago

It's not just about technically having no dependencies, it's also about GNOME people not wanting to read C++ code. If the code is C and GLib, it feels "natural" to them. Just sayin'.

I never get this "separation of concerns" thing. In contrast, I want "all things Audacity" in one file, and "all things AppImage" in one shared library. Everything else feels just more complex to me.

TheAssassin commented 5 years ago

@azubieta please use libappimage 1.x as the new version number, as this would be a breaking change.

probonopd commented 5 years ago

1.x implies a stable API/ABI, which I think we don't have

TheAssassin commented 5 years ago

It's a breaking change... well, if you insist, we can use 0.2.0, and then, once it's stable, release 1.0.0.

probonopd commented 5 years ago

Yes, that's more in line with how I read https://semver.org/#how-do-i-know-when-to-release-100
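For readers less familiar with the versioning argument, here is a rough C++ sketch of one common reading of semver.org: a 0.y.z release promises no stability, so only an exact match is treated as safe, while from 1.0.0 on a newer release with the same major version is expected to stay compatible. The `Version` struct and `isCompatible` function are made up for illustration and are not part of libappimage.

```cpp
#include <tuple>

// Hypothetical version triple, for illustration only.
struct Version {
    int majorNum;
    int minorNum;
    int patchNum;
};

// Returns true if a client built against `required` can expect
// `installed` to work, under the reading of SemVer described above.
bool isCompatible(const Version& installed, const Version& required) {
    // A different major version signals a breaking change.
    if (installed.majorNum != required.majorNum) return false;

    if (installed.majorNum == 0) {
        // 0.y.z: anything may change at any time, so only trust an exact match.
        return std::tie(installed.minorNum, installed.patchNum) ==
               std::tie(required.minorNum, required.patchNum);
    }

    // >= 1.0.0: the same major version with an equal or newer minor/patch is fine.
    return std::tie(installed.minorNum, installed.patchNum) >=
           std::tie(required.minorNum, required.patchNum);
}
```

Under this reading, calling the rewrite 0.2.0 simply signals that clients should pin the exact version until 1.0.0 is released.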

TheAssassin commented 5 years ago

About "modern" C++: in order to remain desktop agnostic, we should have zero external dependencies, and this could be achieved by statically linking libstdc and libstdc++. This might sound like we will end up with large binaries, but with the proper linker instructions, all code sections that are not used can be removed from the binary. Also, the main interface will be pure C, so all kinds of desktop environments will be able to use it. C++-friendly desktop environments could use the non-statically linked version.

How libappimage is built and distributed is none of our concern. We only need to decide how it's shipped in our products. And as we use AppImage for most distribution purposes and never met any desktop distro that does not ship the C++ STL implementation, why should we have to worry about that now?

libstdc++ etc. are available on GNOME(-based) distros as well as on others. Why are we even talking about this?

It's important to say that our libs will keep being shared libs; they will only embed the required sections of libstdc and libstdc++.

Not sure what you mean by that...

Split one .so into two

The reason for this is to not mix desktop integration logic with AppImage reading logic (see separation of concerns). Client applications that only need to extract a given file will then not have to link to the whole blob. The public APIs will be kept minimal, which eases maintenance and reduces backward compatibility issues. Also, since the desktop integration code is more likely to change than the AppImage reading code, client apps that only link libappimage will not be affected by changes in the other library.

I highly agree with this. It's better software design, but if that doesn't convince you already, it also allows for creating single libraries within the ecosystem of libappimage so that they have minimal dependencies. That means users who only need some way to recognize AppImages reliably don't need to link to (and ship) the libraries needed to actually open AppImages and extract files.
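To make the "only recognize AppImages" use case concrete, here is a minimal C++ sketch. It assumes the magic bytes as I understand them from the AppImage specification: the ELF magic at offset 0, then `A`, `I` and the type byte (0x01 or 0x02) at offset 8. The function name `detectAppImageType` is hypothetical and is not libappimage's actual API; a real implementation may also need fallback heuristics for older type 1 files that lack the magic bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of libappimage's actual API.
// Returns 1 or 2 for the AppImage type, or -1 if the header does not match.
int detectAppImageType(const std::vector<std::uint8_t>& header) {
    // We need at least 11 bytes: ELF magic (0..3) plus "AI" and the type byte (8..10).
    if (header.size() < 11) return -1;

    // ELF magic: 0x7f 'E' 'L' 'F'
    const std::uint8_t elfMagic[4] = {0x7f, 'E', 'L', 'F'};
    for (std::size_t i = 0; i < 4; ++i) {
        if (header[i] != elfMagic[i]) return -1;
    }

    // AppImage magic: 'A' 'I' at offsets 8 and 9, type byte at offset 10.
    if (header[8] != 0x41 || header[9] != 0x49) return -1;
    if (header[10] == 0x01 || header[10] == 0x02) return header[10];
    return -1;
}
```

A check like this needs nothing beyond the standard library, which is the kind of minimal-dependency library the split would allow.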

I don't buy the API compatibility argument completely, as we will have to handle this with proper versioning anyway. But generally, it'd be nicer to split the codebase into smaller, easier-to-handle-and-maintain amounts of code. That makes reading and understanding the codebase a lot easier. Testing will also be much easier, as metrics like coverage data etc. can be generated and evaluated with less effort.

It's not just about technically having no dependencies, it's also about GNOME people not wanting to read C++ code. If the code is C and GLib, it feels "natural" to them. Just sayin'.

They had their chance to get in touch with us regarding desktop integration, I'd say. Sure, we're open to suggestions and shouldn't perform any actions that will break compatibility with them. But compatibility doesn't mean "restricting us to use their libraries". Do KDE/Nitrux/... folks complain libappimage is linked to GLib? So far, nobody has complained about that. Why should the GNOME folks be different here, if we offer them a C-compatible ABI?

I never get this "separation of concerns" thing. In contrast, I want "all things Audacity" in one file, and "all things AppImage" in one shared library. Everything else feels just more complex to me.

Complexity is of course really bad. But trust me, what we had before in this AppImageKit repository was way worse in terms of structuring.

We had extremely low cohesion in the code base. As @azubieta mentioned (elsewhere?), we've had duplicate code that attempted to perform the same actions, but one implementation worked slightly differently, leading to annoying debugging sessions, etc.

We had, however, very high coupling in the code base. #include "shared.c" etc. That might've worked for a single project (no, it wasn't acceptable really, but as said, it did the job), but it surely caused some real bugs, and made some really time-consuming decoupling necessary. Otherwise, we'd have needed to create a second implementation anyway, and that leads us back to having duplicate code. In fact, we still have some duplicate code, as I had to implement quite some functionality that existed nowhere else in AppImageUpdate to be able to implement the feature set (the pre-rewrite AppImageUpdate either called the AppImage in question, or had some "implementation" in bash script using tools like dd or so).

We've been making some great progress on the issues in libappimage, but as all previous improvements ended up in a rewrite of said functionality anyway, we decided to start a real rewrite in C++. C++ helps us program a lot more efficiently, and by being able to use all those data types, it helps write more secure code as well. And that's something we really should aim for, as we ship tools for end users, and especially since external projects are starting to use our code.
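As one illustration of the "data types help write more secure code" point, here is a small C++ sketch using RAII: a `std::unique_ptr` with a custom deleter closes the file on every return path, something that has to be done by hand in plain C. The names `FileCloser`, `FilePtr` and `readFirstBytes` are invented for this example and do not come from the libappimage code base.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>

// Deleter that closes a C FILE handle. unique_ptr only calls it
// when the stored pointer is non-null.
struct FileCloser {
    void operator()(std::FILE* file) const { std::fclose(file); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: reads up to `count` bytes from the start of a file.
// Returns an empty string if the file cannot be opened.
std::string readFirstBytes(const char* path, std::size_t count) {
    FilePtr file(std::fopen(path, "rb"));
    if (!file) return {};  // nothing to clean up

    std::string buffer(count, '\0');
    const std::size_t got = std::fread(&buffer[0], 1, count, file.get());
    buffer.resize(got);  // keep only the bytes actually read
    return buffer;       // the file is closed automatically here
}
```

The point is not this particular helper but that ownership is expressed in the type, so a forgotten `fclose` on an early return cannot happen.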

We're trying to improve the overall code quality in the AppImage ecosystem, following established methods that are used with great success in the world of software engineering and development. We're not making anything more complex, quite the opposite, actually. We're adding some better structure to the code base, avoiding duplication/redundancy/maintenance overheads/..., attempting to implement loose coupling and high cohesion, and ending up with a better, more efficient and more secure project.

Now, please tell me again why this proposal is a bad idea, @probonopd.

probonopd commented 5 years ago

C++

Don't get me wrong, there is a reason why I was using C++ and Qt for e.g., linuxdeployqt (a high-level tool the success of which is not determined by how pleasing it is for GNOME developers).

Low-level tools are a different story. I think we should do those in a way that is generic enough (not just technically - also from a mindset perspective) that they can have a place on every Linux desktop, be it a KDE-style or a GNOME-style desktop. I came to the conclusion that using C is the way to go.

Do KDE/Nitrux/... folks complain libappimage is linked to GLib? So far, nobody has complained about that. Why should the GNOME folks be different here

Since writing in plain C is really cumbersome (and time is scarce) I already "compromised" by allowing GLib (as it is used by the GNOME camp anyway and seems to be acceptable also in the KDE camp). The GNOME camp already now is very hesitant to embrace AppImage concepts, I think that writing code in a language that feels "alien" to them will not help the cause.

Actually, let's bring in some evidence.

freedesktop.org is open source / open discussion software projects working on interoperability and shared technology for X Window System desktops. The most famous X desktops are GNOME and KDE, but developers working on any Linux/UNIX GUI technology are welcome to participate.

freedesktop.org is building a base platform for desktop software on Linux and UNIX. The elements of this platform have become the backend for higher-level application-visible APIs such as Qt, GTK+, XUL, VCL, WINE, GNOME, and KDE. The base platform is both software and specifications. (...) freedesktop.org hosts any "on-topic" software projects.

Let's check what this software is written in:

https://gitlab.freedesktop.org/freedesktop/freedesktop/issues/2#already-migrated-to-gitlab

Split one .so into two

Let's look at two aspects:

How the source code is organized internally

Here we started from one repository that held all source code. It was easy to share code among the various parts by just including one source file. (All the mainline Linux drivers are in the same repository as the Kernel, for what I assume to be similar reasoning about simplicity and avoiding dependencies.)

When we started to spread things out across multiple repositories, things became much more complicated.

I understand that a library was wanted that can be used not only by the AppImage project but also by external "consumers" like desktop environments, so moving the library out was probably the right thing to do, even if it came at a substantial complexity cost.

Organizing the source code in a maintainable way so that we have minimal to no code duplication is of course worth the refactoring effort.

How the code is shipped in binary form

How does having two .so files simplify anything? The only argument I can see is that some users may want to use AppImageKit functionality without desktop integration functionality. Still, I think we should have the entire functionality in one library, since otherwise we introduce a dependency between the two libraries which must be kept in sync. Let's keep things simple! (For example, I always hate it when I want to install a third-party deb on Debian/Ubuntu and this third-party deb has dependencies outside of the default repository, factually forcing me to add a third-party repository. Seriously!)

azubieta commented 5 years ago

C++

Our main public interface will remain just C, and we will provide a standalone build that will not depend on libstdc++. Therefore, projects like Gnome will be able to use it without any issue. The unused parts of libstdc++ will also be stripped. Finally, our binary will be standalone, small and portable, as required.

Split one .so into two

Both .so files can be built from the same source project, so the overall complexity will not grow at all.

probonopd commented 5 years ago

Our main public interface will remain just C

C programmers won't touch C++ code, I fear

Both .so can be built from the same source project s

Yes, but what do we gain?

azubieta commented 5 years ago

C programmers won't touch C++ code, I fear

I know there are people who hate C++; the point is that they will not see the C++ part. You know what? It would be great to get in touch with the Gnome guys and ask them if they have any issue with this approach.

Yes, but what do we gain?

The only argument I can see is that some users may want to use AppImageKit functionality without desktop integration functionality

Exactly, this also means that they will have fewer API breakages as the desktop integration API is more likely to change than the AppImage reading API. Making our users work less is an important thing IMHO.

probonopd commented 5 years ago

You know what? It would be great to get in touch with the Gnome guys and ask them if they have any issue with this approach.

+1

Exactly, this also means that they will have fewer API breakages as the desktop integration API is more likely to change than the AppImage reading API.

Assuming that one version of the desktop integration API will be tied closely to the matching version of the AppImage reading API, why not have them in one .so?

TheAssassin commented 5 years ago

@azubieta just don't worry about this. They are free to implement their own code if they are unhappy with what they can find here...

@probonopd why should that be necessary if one depends on the other? Monolithic solutions are always inferior... modularity ftw!

TheAssassin commented 5 years ago

C++

Our main public interface will remain just C, and we will provide a standalone build that will not depend on libstdc++. Therefore, projects like Gnome will be able to use it without any issue. The unused parts of libstdc++ will also be stripped. Finally, our binary will be standalone, small and portable, as required.

Why? Why not make a pretty C++ interface, and provide a pure-C wrapper to it? Like, writing classes that encapsulate the actual functionality and can be used from C++ directly? From a C interface, one can simply re-instantiate those classes on every call, which will be slightly less efficient, but that way round, it's a much more sensible design approach.
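A minimal sketch of what such a layering could look like, under assumptions: the class `AppImagePath` and the C function `ai_file_name` are invented for this illustration and are not libappimage's real API. The C function re-instantiates the C++ class on every call, as described above, and catches all exceptions so that none cross the C boundary.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical C++ class; name and behavior are illustrative only.
class AppImagePath {
public:
    explicit AppImagePath(std::string path) : path_(std::move(path)) {}

    // Returns the part after the last '/', or the whole path if there is none.
    std::string fileName() const {
        const std::string::size_type pos = path_.rfind('/');
        return pos == std::string::npos ? path_ : path_.substr(pos + 1);
    }

private:
    std::string path_;
};

// Hypothetical C wrapper: copies the file name into a caller-provided buffer.
// Returns 0 on success and -1 on failure.
extern "C" int ai_file_name(const char* path, char* out, std::size_t outSize) {
    if (path == nullptr || out == nullptr || outSize == 0) return -1;
    try {
        // The class is instantiated anew on every call.
        const std::string name = AppImagePath(path).fileName();
        if (name.size() + 1 > outSize) return -1;  // would not fit with the terminator
        std::memcpy(out, name.c_str(), name.size() + 1);
        return 0;
    } catch (...) {
        return -1;  // never let a C++ exception cross the C boundary
    }
}
```

A C caller only sees `ai_file_name` and plain buffers. An alternative to re-instantiating on each call would be an opaque handle with explicit create and destroy functions, at the cost of a slightly larger C API.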

azubieta commented 5 years ago

Why? Why not make a pretty C++ interface

We will have such an interface, but not as a first-class citizen. A C++ interface is a bit more difficult to maintain, and we will always need a C interface to enhance compatibility. So I would like to first replace the current interface implementation and, once that is finished, proceed to expose the C++ interface.

TheAssassin commented 5 years ago

That's a silly design decision, to be honest. Crippling C++ down to a C-style interface for people who don't even exist right now.

Nobody's requested a C-style interface. We only need it in the long term. A pretty C++ class-based interface will make libappimage a lot more accessible than any C interface, and more usable for people who use libappimage now, which is mostly C++ applications.

azubieta commented 5 years ago

You are ignoring the fact that C interfaces are supported by almost every programming language. By not providing such an interface we will limit the usability, which is something that we definitely don't want.

TheAssassin commented 5 years ago

@azubieta not really. I said it's an important feature, but only in the long term. Such an interface could be built as a light wrapper just fine...

azubieta commented 5 years ago

I just had a chat on the #gnome-hackers IRC channel with @ebassi about the C++ issue. They don't seem to have any issues with plugins that are linked to libstdc++. As the thumbnailers are plugins and the Nautilus file manager also accepts plugins, we can proceed.

probonopd commented 5 years ago

Thanks for checking with them @azubieta. What does a C++ switch mean in terms of the resulting binary size?

TheAssassin commented 5 years ago

Shouldn't bloat it too much, as libstdc++ can be used from the system. Runtime wise, if designed properly (and yes, I consider using C++ classes and all the features bound to that to be proper, C-style stuff is not), we should be able to see some benefits, too. @azubieta is building some nice C++-ish interfaces on which we might even be able to use the STL's algorithms etc.

Also, the amount of redundancy in the new code is expected to be significantly lower than in the old code.

probonopd commented 5 years ago

If they accept such plugins it doesn't mean that they will contribute and embrace though. (It could also be "we don't care about AppImage anyway, and we won't include the plugin in stock GNOME anyway no matter how much you try, so it doesn't matter what it is written in") - my question would be, what do we need to do in order to become a loved, embraced, true first-class citizen inside GNOME Files a.k.a. Nautilus.

azubieta commented 5 years ago

@probonopd I also asked about that kind of thing; the answer I got was something like "we don't have an upstream plugin collection", and they don't (check https://gitlab.gnome.org/GNOME/). So we will have to support such plugins ourselves :disappointed: and it is up to distributions to decide whether they want our software in or not.

About becoming loved, I guess that Gnome is the wrong target. We should become loved by application users and developers; they will push Gnome and the rest to adopt AppImages. I don't think it would work otherwise (consider that they already have Flatpak).

probonopd commented 5 years ago

consider that they already have Flatpak

which I rather consider complementary than competing because

It is not one-file drag-and-drop in the file manager, it's rather a substitute for rpm
It is not "portable applications", it's rather a substitute for installed applications

We should become loved by application users and developers

Full ack, that is our audience. Gnome is the gatekeeper that stands between us, developers, and users...

probonopd commented 5 years ago

Some things just never seem to change:

Darin Adler has been programming computers since 1976. He loves to do it. His first major professional experience was at Apple Computer. In 1988 he led the team that rewrote the Macintosh Finder in C++ (...) and helped start Eazel, a company that worked to make Linux easier to use and developed the Nautilus graphical shell for GNOME. (...) The other people working on the GNOME project don't like C++, so he's writing a lot of C code these days. (2001)

https://www.boost.org/users/people/darin_adler.html

He later went back to Apple and was instrumental in making WebKit and Safari, both cornerstone technologies for Mac OS X and the iPhone, which interestingly started from a KDE KHTML codebase and is written in, you guessed it, C++.

(I have written about Eazel and the steady decline of the desktop since then here, and there is more about Safari straight from the source here.)

azubieta commented 5 years ago

As the proposal is now under development we can close the issue. Tracking will continue on #33

AppImageCommunity / libappimage

libappimage 0.2.0 proposal #26
