jacebrowning / datafiles

A file-based ORM for Python dataclasses.
https://datafiles.readthedocs.io
MIT License

Field declaration order matters when using a multi-variable pattern #334

Open devlounge opened 1 month ago

devlounge commented 1 month ago

It looks like the order in which the dataclass fields are declared has to match the order in which they appear in the path pattern.

Example directory structure:

environments
  dev
    config
      servers
        server-1
          server.yaml
        server-2
          server.yaml
        server-3
          server.yaml

Datafile definition:

from datafiles import datafile

@datafile("./environments/{self.environment}/config/servers/{self.name}/server.yaml")
class Server:
    name: str
    environment: str

When running:

from manager.models import Server

if __name__ == '__main__':
    servers = list(Server.objects.all())

It ends up raising:

FileNotFoundError: [Errno 2] No such file or directory: '/project/configuration/environments/server-1/config/servers/dev/server.yaml'

This seems to come from the manager's all method, which does yield self.get(*values) with values ["server-1", "dev"]. That call ends up building the path posted above, because manager.get iterates over the fields in the order they are declared and assigns the values positionally.

If I declare:

@datafile("./environments/{self.environment}/config/servers/{self.name}/server.yaml")
class Server:
    environment: str
    name: str

It works.
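The mismatch can be reproduced without datafiles at all. The sketch below uses plain dataclasses and str.format, and assumes the values reach the constructor in the order the placeholders appear in the pattern, which is what the failing path suggests. The class names and the build_path helper are made up for illustration and are not part of datafiles.

```python
from dataclasses import dataclass

PATTERN = "./environments/{self.environment}/config/servers/{self.name}/server.yaml"


@dataclass
class ServerNameFirst:
    name: str
    environment: str


@dataclass
class ServerEnvironmentFirst:
    environment: str
    name: str


def build_path(model, values):
    """Assign values to fields positionally, then format the pattern."""
    instance = model(*values)
    return PATTERN.format(self=instance)


# Assumption: values are listed in the order the placeholders appear in PATTERN.
values_in_pattern_order = ["dev", "server-1"]

wrong = build_path(ServerNameFirst, values_in_pattern_order)
right = build_path(ServerEnvironmentFirst, values_in_pattern_order)

print(wrong)
print(right)
```

Only the class whose fields are declared in pattern order produces a path that exists in the directory structure above.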

jacebrowning commented 1 month ago

Thanks for the clear steps to reproduce!

There may have been a reason it was implemented this way, but converting the parse() results into keyword arguments might make this more robust, if someone wants to try:

https://github.com/jacebrowning/datafiles/blob/59566a650f2624f2fc214bd9d8b3c5e7d825da57/datafiles/manager.py#L137-L150
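As a rough illustration of the keyword-argument idea, here is a minimal sketch that does not touch the datafiles code linked above. It uses the standard library re module as a stand-in for parse(), and the helpers pattern_to_regex and load_by_keyword are hypothetical names. The point is only that named matches make the field declaration order irrelevant.

```python
import re
from dataclasses import dataclass

PATTERN = "environments/{self.environment}/config/servers/{self.name}/server.yaml"


@dataclass
class Server:
    # Deliberately declared in the "wrong" order relative to PATTERN.
    name: str
    environment: str


def pattern_to_regex(pattern):
    """Turn each {self.<field>} placeholder into a named regex group."""
    # With one capture group, re.split alternates literal text and field names.
    parts = re.split(r"\{self\.(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2:
            regex += f"(?P<{part}>[^/]+)"
        else:
            regex += re.escape(part)
    return re.compile(regex + "$")


def load_by_keyword(model, pattern, filename):
    """Build a model instance from named matches instead of positional values."""
    match = pattern_to_regex(pattern).search(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not match {pattern!r}")
    return model(**match.groupdict())


server = load_by_keyword(
    Server,
    PATTERN,
    "/project/configuration/environments/dev/config/servers/server-1/server.yaml",
)
print(server)
```

Because the values are passed by name, Server ends up with name "server-1" and environment "dev" even though name is declared first.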

jacebrowning commented 1 month ago

Alternatively, datafiles could raise an exception when the field ordering does not match the pattern.
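One way such a check might look, as a sketch only: it uses string.Formatter from the standard library to read the placeholder names out of the pattern and compares them with the dataclass field order. The function names and the choice of TypeError are assumptions for illustration, not existing datafiles behavior.

```python
from dataclasses import dataclass, fields
from string import Formatter

PATTERN = "./environments/{self.environment}/config/servers/{self.name}/server.yaml"


def pattern_field_names(pattern):
    """Return the names used as {self.<name>} placeholders, in pattern order."""
    names = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(pattern):
        if field_name and field_name.startswith("self."):
            name = field_name[len("self."):]
            if name not in names:
                names.append(name)
    return names


def check_field_order(model, pattern):
    """Raise if the pattern fields are declared in a different order than in the pattern."""
    expected = pattern_field_names(pattern)
    declared = [f.name for f in fields(model) if f.name in expected]
    if declared != expected:
        raise TypeError(
            f"{model.__name__} declares {declared} but the pattern uses {expected}"
        )


@dataclass
class GoodServer:
    environment: str
    name: str


@dataclass
class BadServer:
    name: str
    environment: str


check_field_order(GoodServer, PATTERN)  # passes silently

try:
    check_field_order(BadServer, PATTERN)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

A check like this could run once when the class is decorated, so the problem surfaces at import time instead of as a FileNotFoundError later.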

Or we simply call out this gotcha in the documentation.

I'm open to suggestions!