openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
GNU Affero General Public License v3.0

Re-architect Product Opener and re-implement it (or part of it) in Python #5170

Open stephanegigandet opened 3 years ago

stephanegigandet commented 3 years ago

It's been almost 10 years since I started to code Open Food Facts. The code has grown a lot since then, mostly organically, with a few refactors along the way (thanks a lot to everyone who took part in it and made it happen!). So it may be a good time to revisit some things (such as how we store data), to think of a better architecture (e.g. a better separation of backend and frontend, and of analyzing products and querying them), and possibly to change some technologies (e.g. the database).

We could also think about implementing some parts in another programming language that has a larger pool of developers. Sadly my kids don't learn Perl at school, but they learn Python.

So this issue is to start the discussion and see how many developers would potentially be interested in working on a re-architecture / re-implementation of Product Opener (hopefully an incremental one that we can gradually deploy).

teolemon commented 3 years ago

That would be a good moment to do so, especially to onboard more people (myself included) for more meaningful contributions. This could intersect beautifully with the Folksonomy Engine (https://github.com/openfoodfacts/folksonomy_api), and we could cannibalize some of the stuff that's been developed over the years.

https://github.com/openfoodfacts/OpenFoodFacts-APIRestPython (using a Mongo Dump) https://github.com/search?p=2&q=openfoodfacts+python&type=Repositories https://github.com/klorophyl/openfoodfacts-django https://github.com/FilipeKN4/open-food-facts-challenge

wbraswell commented 3 years ago

I humbly suggest you reconsider the old cliche of "Perl is dead, let's port it to Python instead". We have many exciting new Perl projects, including the Perl compiler, the Perl Town Hall weekly webcast, and the long-awaited Perl version 7.

When my Austin Perl Mongers group came to meet with the Paris Perl Mongers a few years ago, we were very happy to learn about the Open Food Facts project and are still sporting the laptop stickers you gave us, all because of Perl!

I learned about this specific issue because it is being discussed in the Perl group on Facebook: https://www.facebook.com/groups/perlprogrammers/permalink/4213228082043281/

Here are a few comments from FB:

@jjn1056 said: "Seems to be a modperl app and pretty home grown. I would say that not Perl itself is the maintenance issue."

M.W. said: "This will, on past experience with “we must rearchitect in $TRENDIER_LANGUAGE”, not end well."

@zmughal said: "Might be a project [we] could contribute to in order to convince it not to just do a rewrite in another language because something feels popular?"

@stephanegigandet It would appear that your code can, in fact, benefit from an upgrade, although I see no reason to abandon Perl. May I suggest there may be value in considering a modern Perl framework such as Catalyst or Mojolicious or Dancer?

svensven commented 3 years ago

@wbraswell I imagine it's not just about being trendy, but about the practicality/likelihood of getting contributors to the codebase, especially from students/GSoC type projects/etc. Also, even reasonably popular CPAN modules seem to be starting to have maintenance issues/bit rot.

wbraswell commented 3 years ago

@svensven My suggestion of upgrading to a modern Perl framework will go a long way toward lowering the barriers for new developers to join the project. The existing old-school Perl source code can be problematic for even experienced Perl programmers.

As for CPAN, I am a regular user and author of CPAN distributions, and I am not aware of any important CPAN modules suffering from maintenance issues??? Please provide specific examples.

svensven commented 3 years ago

@wbraswell MongoDB and JSON::XS are recent cases that spring to mind, but I remember seeing others with open bugs and no updates for years in my travels. I could also play the "show evidence" card and ask for examples of Perl coders who would've contributed if only the code were arranged differently... :D Moving to a framework could actually be a useful middle step in making the code more portable to another language, if it's reasonably conventional in its concepts.

wbraswell commented 3 years ago

@svensven

JSON::XS is just fine, perfectly alive and well: https://metacpan.org/pod/JSON::XS

The MongoDB driver was EOL'd by a corporation with its openly-admitted concern only for their own bottom line: https://www.mongodb.com/blog/post/the-mongodb-perl-driver-is-being-deprecated

I do not believe the cold rationale and mercenary motivations of corporations should be forced upon humans and open source software projects. If the only thing we cared about was money and profits, then I assert the open source community (including Perl & Python & Open Food Facts) would never have flourished in the first place.

My "show evidence" request is based on the real existing software on CPAN. Your "show evidence" reference is based on a completely-hypothetical "if only" (your own words) scenario which does not exist IRL and is thus impossible to either prove or disprove.

I maintain that there will be no need to abandon Perl if an upgrade to a modern Perl framework is approved and implemented.

akuks commented 3 years ago

I am more than happy to contribute if a modern Perl framework is approved.

stephanegigandet commented 3 years ago

@wbraswell I very much agree that the code can be improved a lot, in many different ways, and that more modern frameworks would make it much better. It's an effort we started a couple of years ago (starting with small things like better documentation, using Template::Toolkit, more tests etc.). We haven't yet migrated to one of the new frameworks, but hopefully we will.

But while it will certainly improve things and make life much easier for the current developers, I don't know how much difference it will make in terms of recruiting new developers to do all the things we would like to do. There are lots of people who are motivated by the idea of the Open Food Facts project - building a Wikipedia of food - but very few who have contributed to the Perl code. To give a sense of scale, we have close to 6000 participants on the Open Food Facts Slack, but only about a dozen people have made a contribution to the Perl code (dozens more have made contributions to this repo, but on the non-Perl parts like taxonomies). Certainly a lot of that is due to my poor programming practices, but as much as I love Perl, I can't deny the fact that almost none of the developers that join our Slack know Perl. For us it's very important that members of the Open Food Facts community can contribute as easily as possible in as many ways as possible. Our goal is not trendiness, it's to enable more members of our community to create or improve things they care about.

Maybe one way forward is to experiment: we can continue to improve the Perl code, and try to migrate to a new Perl framework, and in parallel, port one component (e.g. ingredients list parsing and analysis) to another language like Python, and hopefully we will get more contributors in both languages. Doing something like that would in fact force us to modularize and introduce a much cleaner separation of components, our monolithic Perl code base would certainly benefit a lot from it. That's in fact a major point of this discussion: re-architecting the components of the project, not simply swapping one language for another.

In any case, thanks a lot for your comments. The purpose of this issue is to discuss options, so it's great that you pointed out new ones.

I'd love to also get the opinion of other contributors, like @hangy @Ban3 @svensven @VaiTon @aleene @AcuarioCat @jolesh @zigouras @CloCkWeRX @syl10100 @blazern @rbournhonesque @M123-dev @areeshatariq @CharlesNepote @roshnaeem. Would you contribute more (or less?) if our Perl code were cleaner / more organized / used newer frameworks? Same question if some of it were in another language like Python?

stephanegigandet commented 3 years ago

@akuks wrote: "I am more than happy to contribute if a modern Perl framework is approved."

Thank you very much @akuks !

M123-dev commented 3 years ago

Very interesting conversations here.

I have no experience with Perl, so I can't say anything about the internal processes, the speed, or the packages, but the syntax doesn't appeal to me. Still, with a bit of coding experience it should not be hard to learn.

What prevented me from contributing was that I didn't know what was what. Not only in the code but on the entire website. I think we should keep perl in the long run, but a quick switch to a hype language especially something like python is not the solution either.

My suggestion would be to split the off-server repo, which has become very large over time. That would give us much more structure, and when we get to the point where we really need a rewrite, we can handle it in sections and choose different tools for different parts. By split, I mean a server/backend repo with the API and user management, and a frontend repo that only does visual tasks, like the mobile apps.

CloCkWeRX commented 3 years ago

I think the points of friction for me, regardless of language; I perceive (which may or may not be true!) things as

I think the way robotoff is structured/integrated into the core site is a good path forward - it's separate and presents necessary APIs for a limited, specialty concern. Pretty much 'microservice done right'. So taking a very limited slice of the application and pulling it out to its own service would make sense to me. Examples like pushing all the search/list views onto elasticsearch? Maybe dedicated imagery resizing/processing/storage? The boring things which have off the shelf open source battle tested alternatives are all good candidates.
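To make the "limited slice behind a small API" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical (`ProductSearch`, `InMemorySearch`, `list_page` and the sample barcodes are illustration only, not Product Opener or robotoff code). It only shows the shape: the page-building code depends on a narrow interface, so the implementation behind it could later be replaced by a separate service (for example one backed by Elasticsearch) without touching the caller.

```python
from typing import Dict, List, Protocol


class ProductSearch(Protocol):
    """Hypothetical narrow interface that the rest of the site would depend on."""

    def search(self, term: str) -> List[str]:
        """Return the barcodes of products whose name matches `term`."""
        ...


class InMemorySearch:
    """Stand-in implementation; a real one could call an external search service."""

    def __init__(self, products: Dict[str, str]) -> None:
        self._products = products  # barcode -> product name (sample data)

    def search(self, term: str) -> List[str]:
        needle = term.lower()
        return sorted(
            code for code, name in self._products.items() if needle in name.lower()
        )


def list_page(backend: ProductSearch, term: str, page_size: int = 2) -> List[str]:
    """Build one page of a list view using only the narrow interface."""
    return backend.search(term)[:page_size]


# Made-up sample data for illustration.
backend = InMemorySearch(
    {"0001": "Dark chocolate", "0002": "Milk Chocolate", "0003": "Orange juice"}
)
page = list_page(backend, "chocolate")
```

Swapping the backend then means writing another class with the same `search` method; `list_page` stays unchanged.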

I would be strongly against a rewrite for the sake of rewriting.

mithridatea commented 2 years ago

I'm in favor of implementing some part of the codebase in other programming languages, to attract new contributors that don't know Perl and/or are intimidated by the codebase complexity. Ingredients list parsing and analysis is a very good first candidate to me: it's a critical piece of code that could benefit from the contributions of people familiar with grammar parsing and NLP. When possible, I also think that new projects that are not too strongly linked to ProductOpener (such as life-cycle analysis projects) should also be implemented as micro-services in other languages.
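As a rough illustration of what a standalone ingredients parsing component might look like in Python, here is a toy sketch. It is not the Product Opener algorithm: the `Ingredient` class, the function names and the input format (comma-separated names, with an optional percentage or a nested sub-list in parentheses) are all assumptions made for the example. Real ingredient lists are far messier.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Ingredient:
    """One parsed ingredient (hypothetical structure for this sketch)."""
    name: str
    percent: Optional[float] = None
    sub_ingredients: List["Ingredient"] = field(default_factory=list)


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    parts.append("".join(current).strip())
    return [part for part in parts if part]


# Matches a parenthesised detail that is only a percentage, e.g. "20%" or "3.5 %".
PERCENT = re.compile(r"^(\d+(?:\.\d+)?)\s*%$")


def parse_ingredients(text: str) -> List[Ingredient]:
    """Parse a comma-separated list with optional '(...)' details."""
    result = []
    for part in split_top_level(text):
        name, _, rest = part.partition("(")
        ingredient = Ingredient(name=name.strip())
        if rest:
            inner = rest.rsplit(")", 1)[0]  # drop the closing parenthesis
            match = PERCENT.match(inner.strip())
            if match:
                ingredient.percent = float(match.group(1))
            else:
                ingredient.sub_ingredients = parse_ingredients(inner)
        result.append(ingredient)
    return result


parsed = parse_ingredients(
    "sugar, cocoa butter (20%), emulsifier (soy lecithin, sunflower lecithin)"
)
```

The point of the sketch is the boundary rather than the parsing itself: text goes in, a plain data structure comes out, so the component could live in its own service and be tested in isolation.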

alexgarel commented 2 years ago

Just to add my grain of salt, I didn't read it all, but I also think that componentisation (not micro-component maybe) is a good idea.

I really think we should distinguish the web server / web API parts from the logic parts. For this to happen we need to untangle the Perl code first, because in some parts it is really entangled. Encapsulation is important there, as is the single responsibility principle. There are already good moves in this direction, but there is a lot to be done!

On my side, I think that an external component for taxonomies would be a good idea. It could eventually use some kind of database instead of in-memory structures (and with an efficient query engine, we would gain performance thanks to indexes). It will, however, need a fairly complete API, and we first need to make this API appear in the current Perl code.
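A minimal sketch of the database-backed taxonomy idea, in Python with the standard `sqlite3` module. The schema (`parents` table), the function names and the sample entries are assumptions for illustration, not the current taxonomy format or API. It stores child/parent links in an indexed table and answers "what are all the ancestors of this entry?" with a recursive query.

```python
import sqlite3
from typing import List, Tuple


def build_taxonomy_db(edges: List[Tuple[str, str]]) -> sqlite3.Connection:
    """Load (child, parent) pairs into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parents (child TEXT NOT NULL, parent TEXT NOT NULL)")
    # The index lets lookups by child avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_parents_child ON parents (child)")
    conn.executemany("INSERT INTO parents (child, parent) VALUES (?, ?)", edges)
    conn.commit()
    return conn


def ancestors(conn: sqlite3.Connection, entry_id: str) -> List[str]:
    """Return every direct and indirect parent of `entry_id`, sorted by id."""
    rows = conn.execute(
        """
        WITH RECURSIVE up(id) AS (
            SELECT parent FROM parents WHERE child = ?
            UNION
            SELECT p.parent FROM parents AS p JOIN up ON p.child = up.id
        )
        SELECT id FROM up ORDER BY id
        """,
        (entry_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Made-up sample entries for illustration.
conn = build_taxonomy_db(
    [
        ("en:dark-chocolates", "en:chocolates"),
        ("en:chocolates", "en:cocoa-products"),
        ("en:cocoa-products", "en:snacks"),
    ]
)
chain = ancestors(conn, "en:dark-chocolates")
```

A function like `ancestors` is the kind of call the "quite complete API" would have to expose; the same signature could first be implemented over the existing in-memory structures and moved to a database later.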

Another part that could be externalized more easily is the user / auth system. It is always a pain to keep it bug-free (and bugs there are a pain for users).

There are other pieces that we may identify. Those could be good candidates for contributions in programs like Google Summer of Code.

yuktea commented 2 years ago

@stephanegigandet wrote: "Maybe one way forward is to experiment: we can continue to improve the Perl code, and try to migrate to a new Perl framework, and in parallel, port one component (e.g. ingredients list parsing and analysis) to another language like Python, and hopefully we will get more contributors in both languages. Doing something like that would in fact force us to modularize and introduce a much cleaner separation of components, our monolithic Perl code base would certainly benefit a lot from it."

This does sound like a good idea. I'm up for working on it with Python, or with another Perl framework if that is in OFF's plan. I'm very new to the project, so I have the perspective of someone who wishes to start contributing. The project is meaningful and has the potential to build a larger active contributor community.

alexgarel commented 2 years ago

As @yuktea said, I think the first task is to untangle the code a bit, so that we can go part by part!

I think we have to do an analysis with @stephanegigandet to have a clear plan on this.

github-actions[bot] commented 7 months ago

This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projects, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts