okfde / froide

Freedom Of Information Portal
MIT License

"Froide more modular" wishlist #210

Open pdehaye opened 7 years ago

pdehaye commented 7 years ago

If Froide were made more modular, partly to support new use cases outside of FOI, what goals should it aim for? This thread is meant to be a dump of ideas. There will be a separate effort to find commonalities, condense them into something realistic, and seek funding.

New use cases:

Current active FOI instances:

pdehaye commented 7 years ago

Froide could be generalized to be a tool helping individuals petition organizations, as structured by laws.

For the Personal Data request service, here are some desirable changes (I'm not saying they should all fit into this modularization of Froide).

  1. Some of the word choices appearing in the model names are too specific, for instance "PublicBody" or "FOI". The most appropriate words for that use case would be "Data Controller" and "Data Protection". In general, a common wording could be "Organization" and "Law". Not Earth-shattering, but a hindrance for further work.
  2. When making a request to a specific organization for one's personal data, there needs to be an additional identification field, specific to the organization/individual pair. For instance, my GitHub handle for a request to GitHub, or my phone number for a request to my phone company. This also has consequences for the automated anonymization step. It can be thought of as an additional, request-specific signature.
  3. It would make sense to be able to clone the content of a request (i.e. without the signature or the request-specific signature).
  4. It would be desirable to construct campaigns, or meta requests. One click would generate many separate requests (say to all the transportation providers in an area).
  5. There needs to be more security around attachments, as there is currently no access control.
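Items 2 to 4 could be sketched roughly as follows. This is only an illustration in plain Python, not Froide's data model: `Organization`, `Request`, `account_identifier`, `clone_request` and `create_campaign` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Organization:
    name: str


@dataclass(frozen=True)
class Request:
    organization: Organization
    law: str
    body: str
    # Hypothetical field: identifies the requester to this one organization
    # (e.g. an account handle or a phone number).
    account_identifier: Optional[str] = None


def clone_request(original: Request, organization: Organization) -> Request:
    """Copy the content of a request, leaving out the per-organization identifier."""
    return Request(organization=organization, law=original.law, body=original.body)


def create_campaign(template: Request, organizations: List[Organization]) -> List[Request]:
    """Generate one separate request per organization from a single template."""
    return [clone_request(template, org) for org in organizations]
```

Keeping the identifier on the request rather than on the user means a clone never carries it over by accident, and the anonymization step has one well-defined field to redact.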

fin commented 7 years ago

just to correct you here on (5): there is access control on attachments to prevent users from accessing unpublished attachments.

pdehaye commented 7 years ago

Thanks for the comment. From my understanding, if using AWS as described in the documentation, the URL for attachments can be guessed and is not protected, at least prior to redaction.

stefanw commented 7 years ago

  1. Yes, this will be the most painful step, but I guess it is necessary. Not sure if it is necessary upfront, though possibly.
  2. Tricky, not sure if this should be a core thing or more of a plugin. Requires an additional model.
  3. We have run campaigns in Germany (see this request for an example) where many people make the same request. We also have an app that lets users make requests from a list of templates. We also have plans for one person making one request to many authorities in a single step. However, this mostly becomes a UI problem (selecting the right authorities, letting the user manage all the requests). We have plans to introduce a grouping through a "project" that can have many requests and also conveys the overall intent of all the requests.
  4. see above.
  5. The URLs of attachments should indeed be non-guessable and also better sharded. I did not have S3 in mind during the initial design. We currently use the nginx X-Accel-Redirect settings with this view that handles authorization.
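
For item 5, a minimal sketch of what non-guessable, sharded storage paths could look like. The function name and the `attachments/` prefix are assumptions made for the example, and this is not how Froide currently names files.

```python
import secrets
from pathlib import PurePosixPath


def attachment_storage_path(filename: str) -> str:
    """Build a storage path containing a random token, sharded by its first characters."""
    token = secrets.token_hex(16)  # 32 hex characters, 128 bits of randomness
    safe_name = PurePosixPath(filename).name  # drop any directory components
    # Two shard levels keep any single directory (or key prefix) from growing too large.
    return f"attachments/{token[:2]}/{token[2:4]}/{token}/{safe_name}"
```

A random path only makes URLs hard to guess. It would complement the authorizing view rather than replace it, since anyone who learns the URL can still fetch the file if the storage itself is public.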

In general I would like to refrain from adding features and rather move 'business logic' out of froide and make it more pluggable (there's a small implementation of a hook-system already in place).

Identifying what the bare bones of the application is and stripping it down to that while maintaining the features in separate apps for the existing websites will be the toughest part.

stefanw commented 7 years ago

Also, we should consider that there are other Django-based FOI software projects out there, namely MuckRock and FOIAMachine. I don't want to push for a grand unifying theory just yet, but ideally at some point maintenance of some components could be shared.

We have to realise that for FOIA portal providers there's no benefit in running a more modular software except if the reduced maintenance cost justifies the porting effort.

pdehaye commented 7 years ago

@stefanw It looks like MuckRock and FOIAMachine just merged, at least at the level of code, and MuckRock just went open source.

stefanw commented 7 years ago

@pdehaye I know, that's why I mentioned them. 😉 We know each other and are in contact, but I believe our agendas are driven mostly by our FOIA mission and not necessarily by a wish to create the most versatile piece of public email software. Maybe you can ask @morisy if he believes your project may be easier to implement on top of, or by evolving from, the muckrock code?

Also if you don't know Alaveteli, check them out. Their code should be more stripped down and possibly supports your use case quite well.

ryankanno commented 7 years ago

This is the start of a really great discussion.

We've customized quite a few bits of Froide out here in Hawaii. Just to get my two cents in: as someone who's working on a fork, more use of the hooks/signals infrastructure that @stefanw alluded to would be time and effort well spent.

We had to fork and modify quite a bit of the core code (but would have been able to just wire into pre/post signals if they were available).

pdehaye commented 7 years ago

(tagging @loleg who also has experience setting up a Froide instance)

stefanw commented 7 years ago

@ryankanno honestly, I had no idea. It would be helpful if you could describe the places where you would need those hooks.

As I understand it, signals can only notify you of events, but don't let you influence main program flow. That's where I see hooks coming in (if anyone can recommend a Python library that does that better, let me know).
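
To make the distinction concrete, here is a minimal sketch of a hook registry whose handlers can change the value flowing through the program, which a notify-only signal handler cannot do. It is plain Python and is not Froide's existing hook implementation; `HookRegistry`, the hook name `pre_request_send` and the `add_footer` handler are made up for the example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class HookRegistry:
    """Handlers registered under a name receive a value and return a (possibly changed) value."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[..., Any]]] = defaultdict(list)

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self._handlers[name].append(handler)

    def run(self, name: str, value: Any, **context: Any) -> Any:
        # Pass the value through each handler in registration order.
        for handler in self._handlers[name]:
            value = handler(value, **context)
        return value


hooks = HookRegistry()


def add_footer(body: str, **context: Any) -> str:
    # Hypothetical downstream customization: append a line to the outgoing text.
    return body + "\n-- sent via example portal"


hooks.register("pre_request_send", add_footer)
```

The core code would call `hooks.run("pre_request_send", body)` and continue with whatever comes back. With no handlers registered the value passes through unchanged, so the default behaviour is preserved.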

Just a reminder: the blessed way to customize a froide instance is to have your own Django project and use froide as a dependency (see this project). That way you can override settings, URLs (and through that views, see here) and (over)extend templates (e.g. like done here, using django-overextends).

Clearly that doesn't help with control flow and main business logic. That would be the harder part 😉. Maybe we need a flow chart of the business logic with points where signals and hooks can be invoked.

ryankanno commented 7 years ago

@stefanw After we launch in a few weeks, I'll come back with where the hooks might have been useful. For deadline's sake, we just forked and modified the code. I hope that statement didn't come across as critical - I think the work y'all have done is amazing and saved us a ton of time to boot. :)

With that said, I think a lot of the customizations / business logic that aren't totally aligned with the core project could be implemented via pre / post signals around all the important business logic junction points. For a concrete example -> one of the things we're doing here is auto-creating a pdf document that the State of Hawaii uses internally to track FOI requests. To do this, we're wired into the FoiRequest.request_created signal, but in order to modify the email that actually gets sent out, there wasn't an obvious way to attach the auto-created pdf without touching the code implementing the send function.

I'd have to think through the implementation a bit, but if that were a signal that defaulted to Froide's implementation and could be disabled or overridden by a downstream project via config, we'd be able to define the behavior in our own projects without getting too far into the weeds of the core Froide code. There probably needs to be a bit of thought put into this. :)
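
A rough sketch of that idea, assuming a config entry that names a replacement handler and falls back to the built-in one when unset. Everything here is hypothetical: the `SEND_REQUEST_HANDLER` key, `load_handler`, and both handlers are stand-ins, and a real Django project would more likely resolve the dotted path with `django.utils.module_loading.import_string`.

```python
from importlib import import_module
from typing import Any, Callable, Dict


def default_send_request(message: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the built-in behaviour: pass the message on unchanged."""
    return message


def load_handler(config: Dict[str, Any], key: str, default: Callable) -> Callable:
    """Return the handler configured under `key`, or `default` when nothing is set."""
    target = config.get(key)
    if target is None:
        return default
    if callable(target):
        return target
    # Otherwise treat the value as a dotted path such as "myproject.mail.send".
    module_path, _, attr = target.rpartition(".")
    return getattr(import_module(module_path), attr)


def send_with_pdf(message: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical downstream override: attach a generated tracking PDF."""
    attachments = message.get("attachments", []) + ["tracking-form.pdf"]
    return {**message, "attachments": attachments}
```

A downstream project would then set `config = {"SEND_REQUEST_HANDLER": send_with_pdf}` and the core would call `load_handler(config, "SEND_REQUEST_HANDLER", default_send_request)(message)` at the point where the email is assembled.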

As a note, we did create our own project using the Froide theme project (very slick implementation, btw).