seedcase-project / seedcase-sprout

Upload your research data to formally structure it for better, more reliable, and easier research.
https://sprout.seedcase-project.org/
MIT License

Investigate into converting code into reusable Django app #479

Closed lwjohnst86 closed 2 days ago

martonvago commented 1 week ago

Some project structure suggestions:

Tripartite structure, separating out core logic, web front end and command-line interface (using e.g. click):

FROM

config
accounts
sprout
manage.py
pyproject.toml

TO

config
accounts
    core: logic, models, migrations
    web: templates, views
    cli: commands
sprout
    core: logic, models, migrations
    web: templates, views
    cli: commands
manage.py
pyproject.toml
    register entry points here

Alternatively, we can use custom django-admin commands for the command-line interface. There’s a click-based library for this. This would mean defining our commands in e.g. core/management/commands.

Main challenge (that I’ve run into): We want CLI commands to be accessible directly in the terminal (e.g. > my-command --option 3) when Sprout is installed, so we have to add our commands as entry points in pyproject.toml. This is a more complex setup than using custom admin commands (e.g. python manage.py my-command --option 3). When the app is accessed directly through a command, so not via manage.py, Django has to be set up manually, including pointing to the settings file, running django.setup() and running the migrations. Ideally (perhaps), these setup commands would be run once in a sort of post-install step, not before every command.
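A minimal sketch of the manual setup described above, not Sprout's actual code: `bootstrap_django` does by hand what `manage.py` would otherwise do before a command runs. The settings path `config.settings`, the command name `my-command`, and the entry-point target in the comment are assumptions for illustration. `argparse` stands in for click here only to keep the example free of extra dependencies.

```python
import argparse
import os
from typing import Callable, Optional, Sequence

# Hypothetical pyproject.toml registration for this function:
#   [project.scripts]
#   my-command = "sprout.cli.commands:main"


def bootstrap_django(settings_module: str = "config.settings") -> None:
    """Configure Django by hand, since manage.py is not involved."""
    # Imported lazily so that --help and argument errors do not need Django.
    import django
    from django.core.management import call_command

    # Point Django at the settings file (assumed location).
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    django.setup()
    # Make sure the database schema exists before the command touches it.
    call_command("migrate", interactive=False, verbosity=0)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="my-command")
    parser.add_argument("--option", type=int, default=0)
    return parser


def main(
    argv: Optional[Sequence[str]] = None,
    bootstrap: Callable[[], None] = bootstrap_django,
) -> int:
    """Console entry point; bootstrap is injectable so parsing can be tried without Django."""
    args = build_parser().parse_args(argv)
    bootstrap()
    print(f"my-command running with option={args.option}")
    return 0
```

Calling `migrate` inside `bootstrap_django` on every invocation is the overhead that a one-off post-install step would avoid.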

I’m not 100% sure what the best strategy would be for this, some possible options:

lwjohnst86 commented 1 week ago

Wonderful work!

For the challenges, what if we don't build the core logic using Django? So just regular Python? And in the web section, include the code from the core logic? I know we lose access to the ORM, but is that necessary in the core logic? For instance, many of the Models we really only use in the context of the web, right? Like DataType, Table, etc. Especially if we revise the code to use the frictionless standard, we can avoid relying on the ORM. So the migrations and models could be moved to the Django web folder?

martonvago commented 1 week ago

Hmm, I guess I’m struggling a bit to picture what exactly this setup would look like. 😅

Ignoring user management for now, I can imagine having a database in the core application that is managed independently of Django. Then we could point the Django web app to this core database and add Django models to match the schema. But then migrations (using some other library) would need to live in the core app because we would want this to work independently of the web app. This way we would avoid having to set up Django just for running CLI commands, but the database would still need to be initialised before commands can be run.
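To make the point about initialising the database concrete, here is a minimal sketch using only the standard library's `sqlite3`, assuming the core database is SQLite. The `MIGRATIONS` list, the `data_table` schema and the `init_db` function are hypothetical. A real setup would more likely rely on a dedicated migration library, as mentioned above.

```python
import sqlite3

# Hypothetical, ordered schema migrations owned by the core package.
MIGRATIONS = [
    "CREATE TABLE data_table (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)",
    "ALTER TABLE data_table ADD COLUMN description TEXT NOT NULL DEFAULT ''",
]


def init_db(path: str) -> sqlite3.Connection:
    """Open the core database and apply any migrations not yet applied."""
    conn = sqlite3.connect(path)
    # SQLite keeps an integer schema version inside the database file.
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    # Apply only the migrations after the recorded version, in order.
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn
```

Every CLI command would call something like `init_db` first, which is cheap once the schema is current. On the Django side, a model with `managed = False` and a matching `db_table` in its `Meta` could then read this table without Django trying to migrate it.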

Or are you thinking that with frictionless we wouldn’t need a database in the core app? I can see that much of our functionality could be implemented with data and metadata stored simply as files, but if we wanted e.g. data to be editable (like in the admin panel) or autogenerated audit logs, wouldn’t that be lots of extra work to implement?

lwjohnst86 commented 1 week ago

Hmm, good questions. I guess I also struggle to see how using the migrations would work in the CLI setup 😆 We'll see as we build it I guess!!

But yea, I don't know if it would be necessary to deal with that stuff if we use the frictionless approach. And creating a tool to edit the data isn't exactly part of our design even now, so I don't think that's important. Unless you mean data generically (e.g. project data)

lwjohnst86 commented 1 week ago

What do you think about calling it app instead of web, to be more generic?

martonvago commented 1 week ago

I'm happy with app!

martonvago commented 6 days ago

Based on our discussion yesterday, here’s an updated version of the high-level folder structure with some of the more important files. In this version it’s assumed that core and cli do not have a database and do not rely on Django. Data and metadata files are stored on the disk as part of core, following the frictionless standard and serving as the “source of truth”. app is a Django app with a database, where the entries for metadata and data files are paths pointing to these files on the disk.

I’m less sure about the structure of accounts. If user management is only needed when Sprout is deployed on a server then accounts can just be a simple Django app. If we want user management when Sprout is used on the command line as well then the situation becomes more complex.

seedcase-sprout/
├─ config/
├─ persistent_storage/
├─ staticfiles/
├─ sprout/
   ├─ core/
      ├─ csv/
      ├─ validators.py
      ├─ fixtures/ (or appropriate sample data)
   ├─ app/
      ├─ migrations/
      ├─ models/
      ├─ views/
      ├─ templates/
      ├─ static/
      ├─ forms.py
      ├─ uploaders.py
   ├─ cli/
      ├─ commands/
├─ accounts/
   ├─ default Django folder structure
├─ database.env
├─ manage.py
├─ pyproject.toml
...
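
As a rough illustration of the core side of this split, the sketch below writes a CSV file and a `datapackage.json` descriptor to disk using only the standard library, with no database and no Django. The function names, the `data.csv` file name and the one-folder-per-package layout are assumptions. The descriptor follows the basic shape of a Frictionless Data Package (`name`, `resources`, `schema.fields`) but is not validated against the full standard.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List


def write_package(root: Path, name: str, fields: List[Dict], rows: List[Dict]) -> Path:
    """Write a CSV file plus a datapackage.json descriptor that describes it."""
    package_dir = root / name
    package_dir.mkdir(parents=True, exist_ok=True)

    # The data file: one column per declared field.
    with open(package_dir / "data.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[field["name"] for field in fields])
        writer.writeheader()
        writer.writerows(rows)

    # The metadata file: the path is relative to the descriptor.
    descriptor = {
        "name": name,
        "resources": [
            {"name": name, "path": "data.csv", "schema": {"fields": fields}}
        ],
    }
    descriptor_path = package_dir / "datapackage.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2))
    return descriptor_path


def read_field_names(descriptor_path: Path) -> List[str]:
    """Read the column names back from the descriptor on disk."""
    descriptor = json.loads(descriptor_path.read_text())
    return [field["name"] for field in descriptor["resources"][0]["schema"]["fields"]]
```

In this picture, the Django app would only store the returned descriptor path (for example in a text column of a model), while the files themselves remain the source of truth.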
lwjohnst86 commented 2 days ago

I think it would be nice to incorporate this into the design docs at a high level. But let's do that after trying it out and seeing how it works.