jacebrowning closed this 5 years ago
The simplest way would be to make use of `**` recursive globs (see the `glob()` docs for details). But this is only available in Python 3.5+, and the docs say:
> Note: Using the “`**`” pattern in large directory trees may consume an inordinate amount of time.
The two changes to be made to implement this:
- `GlobFormatter.format_field()` to return `'**'`
- `match()` to call `glob.iglob()` with `recursive=True`
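A minimal sketch of the second change, assuming the pattern has already had a field rendered as `**`. `find_files` is a hypothetical helper name used for illustration, not the project's actual `match()`; the only real API relied on is `glob.iglob(..., recursive=True)` from Python 3.5+.

```python
import glob
import os


def find_files(pattern):
    """Yield files matching a pattern that may contain the recursive '**'.

    Hypothetical helper for illustration; not the project's match().
    """
    # recursive=True lets a '**' path component span nested directories
    for path in glob.iglob(pattern, recursive=True):
        if os.path.isfile(path):
            yield path
```

For example, `list(find_files("data/**/*.yml"))` would pick up `.yml` files at any depth under `data/`.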
PSF is no longer releasing 3.4, and the distros shipping 3.4 are Debian oldstable (Jessie) and Ubuntu 14.04 (Trusty). I'm not sure what your personal feelings are on old Pythons.
I'm OK with requiring Python 3.5+.
~So, Umm... I tried this, and it's doing something weird. Investigating.~
There's a problem. The `parse` module (which I pulled in to reverse file paths based on the `path_format`) produces the regex `data/(?P<self_kind>.+?)/(?P<self_key>.+?)\.yml`, which is ambiguous.
Basically, we don't have knowledge about which keys could contain delimiters used, so it's difficult to extract fields from the path.
Yeah, obviously only one of the attributes can be allowed to contain arbitrary paths.
Could there be some way to indicate which attribute can be expanded? Or assume it's the first attribute?
`parse` assumes non-greedy everything, so all the delimiters will end up in the last field.
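To see this concretely, here is the regex quoted earlier in the thread applied with Python's `re` module. The path `data/a/b/c.yml` is made up for illustration and is not a file from the project.

```python
import re

# The regex quoted earlier in the thread, generated from the path_format
PATTERN = re.compile(r"data/(?P<self_kind>.+?)/(?P<self_key>.+?)\.yml")

# Hypothetical nested path, chosen only to illustrate the problem
match = PATTERN.fullmatch("data/a/b/c.yml")
fields = match.groupdict()
print(fields)  # {'self_kind': 'a', 'self_key': 'b/c'}
```

The equally valid reading `self_kind='a/b'`, `self_key='c'` is never tried, because the non-greedy first group stops at the first `/`.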
The complete `match()` to data sequence is:
1. `match()` attempts to synthesize constructor arguments based on comparing the `path_format` to the found file plus the given keyword arguments
2. `__init__()` is called
3. `sync_instances()` generates a filename from the attributes
4. `sync_object()` does its thing and loads data

There are a few problems with this:
- `path_format`: there are multiple ways to parse a path, a subset of which will match the given kwargs (it currently doesn't even try to find this subset)
- `sync_object()` has a codepath that will completely fubar data if `match()` doesn't get the arguments exactly right. However, if `match()` selects a set of arguments that produce the same filename, that codepath will be skipped (the `mapper.missing` path) and the mapper will `.load()` the data (which I think will correct any errors made by `match()`).

So that's the long way around to the conclusion that I think we just need a version of the parser (what `parse` is doing now) that can handle this ambiguity and select a version that will load. Hopefully. (Step 2 above kinda leaves a whole lot of possibility for poor selection of arguments to hose everything.)
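One way such a parser might look, as a rough sketch under stated assumptions: it handles only a fixed two-field layout, `data/{kind}/{key}.yml`, and `render`, `candidate_parses`, and `select_parse` are hypothetical names that exist neither in this project nor in `parse`. It enumerates every possible split of the path, drops candidates that contradict the given keyword arguments, and keeps only those that regenerate the same filename.

```python
def render(kind, key):
    """Rebuild the filename from the fields (stand-in for the path_format)."""
    return "data/{}/{}.yml".format(kind, key)


def candidate_parses(path):
    """Yield every (kind, key) split of a 'data/<kind>/<key>.yml' path."""
    prefix, suffix = "data/", ".yml"
    if not (path.startswith(prefix) and path.endswith(suffix)):
        return
    parts = path[len(prefix):-len(suffix)].split("/")
    # Each '/' could be the boundary between the two fields
    for i in range(1, len(parts)):
        yield {"kind": "/".join(parts[:i]), "key": "/".join(parts[i:])}


def select_parse(path, **known):
    """Return candidates that agree with the known kwargs and round-trip."""
    return [
        fields
        for fields in candidate_parses(path)
        if all(fields.get(name) == value for name, value in known.items())
        and render(**fields) == path
    ]
```

Calling `select_parse("data/a/b/c.yml")` returns both readings, while adding `kind="a/b"` narrows it to one. When more than one candidate survives, something still has to choose between them, which is exactly the risk described above.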
The other alternative is to refactor a bunch of things to support either:
Neither of these is an option I'm really comfortable handling.
I'm going to close this. My focus is now on the spiritual successor to this project: https://github.com/jacebrowning/datafiles
With https://github.com/jacebrowning/datafiles/issues/5, I plan to add similar functionality.
@astronouth7303 Here's a test that fails in the matching logic. Any thoughts on how to handle this?