Is your feature request related to a problem? Please describe.
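Currently, each format may or may not include custom logic for programmatically determining the name of the file it annotates.
For example, crowsetta.formats.seq.notmat.NotMat does this inside its from_file method (https://github.com/vocalpy/crowsetta/blob/28fd13613c3d08d0592ca522b86b87e669efd3b8/src/crowsetta/formats/seq/notmat.py#L82), while the crowsetta.formats.seq.simple.SimpleSeq.from_file method has a notated_path parameter that defaults to None and that it does nothing with, although the underlying attrs class applies a converter to make it a pathlib.Path if it's not None.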
Describe the solution you'd like
It would be nice if format classes could declare / expose functionality for converting annot_path -> annotated_path.
One reason to do this would be to just make it an explicit part of the "API", so to speak, instead of having it hidden inside some of the from_file functions.
Another reason to do this would be to make it easier (possible) for other libraries to leverage this functionality.
E.g., vak has a map_annotated_to_annot function that could just use each format class's function to do the mapping, instead of the current spaghetti-code logic.
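To make the idea concrete, here is a minimal sketch of what an explicit mapping could look like. Everything in it is an assumption for illustration: ExampleFormat and its to_annotated_path method are hypothetical names, not part of crowsetta, and the sketch assumes a format whose annotation files are named by appending an extension to the annotated file's name. The second function is a hypothetical downstream helper, not vak's actual map_annotated_to_annot.

```python
import pathlib


class ExampleFormat:
    """Hypothetical stand-in for a format class; not part of crowsetta."""

    # Assumed naming convention: annotation file name = annotated file name + ext
    ext = ".not.mat"

    @classmethod
    def to_annotated_path(cls, annot_path):
        """Map the path of an annotation file to the path of the file it annotates."""
        annot_path = pathlib.Path(annot_path)
        if not annot_path.name.endswith(cls.ext):
            raise ValueError(f"expected a path ending in {cls.ext!r}, got: {annot_path}")
        # Strip the annotation extension to recover the annotated file's name
        return annot_path.with_name(annot_path.name[: -len(cls.ext)])


def map_annotated_to_annot(annot_paths, format_class):
    """Hypothetical downstream helper: build {annotated_path: annot_path}
    by delegating to the format class instead of re-implementing its logic."""
    return {format_class.to_annotated_path(p): pathlib.Path(p) for p in annot_paths}
```

With something like this, a downstream library would only need to know which format class it is working with, not the naming rules of each format.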
Describe alternatives you've considered
I can think of a couple of ways to achieve this. They may not be mutually exclusive.
Have a property / method that does this; e.g., annotated_path could be a property that encapsulates the functionality for converting from annot_path to annotated_path.
Allow the annotated_path argument to the from_format method to be a Callable, in addition to being a path itself.
This would let a user override the default behavior by passing in the callable. Downstream libraries could also leverage this functionality; e.g., vak could let a user specify in the config the name of a function, to_annotated_path, and then pass it in, in place of the default, just as a user might, when mapping annotations to the paths of the files they annotate.
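A rough sketch of the second option, under stated assumptions: ExampleSeq is a hypothetical stand-in (not crowsetta's SimpleSeq), the default mapping of dropping the final extension is made up for illustration, and treating None as "use the default mapping" is one possible design choice, not current behavior.

```python
import pathlib
from typing import Callable, Optional, Union

PathLike = Union[str, pathlib.Path]


class ExampleSeq:
    """Hypothetical stand-in for a format class; not crowsetta's SimpleSeq."""

    def __init__(self, annot_path: PathLike, annotated_path: Optional[PathLike] = None):
        self.annot_path = pathlib.Path(annot_path)
        self.annotated_path = None if annotated_path is None else pathlib.Path(annotated_path)

    @staticmethod
    def default_annotated_path(annot_path: pathlib.Path) -> pathlib.Path:
        # Assumed default: drop the final extension, e.g. song1.wav.csv -> song1.wav
        return annot_path.with_suffix("")

    @classmethod
    def from_file(
        cls,
        annot_path: PathLike,
        annotated_path: Union[PathLike, Callable[[pathlib.Path], PathLike], None] = None,
    ) -> "ExampleSeq":
        annot_path = pathlib.Path(annot_path)
        if annotated_path is None:
            # No argument given: fall back to the format's default mapping
            annotated_path = cls.default_annotated_path(annot_path)
        elif callable(annotated_path):
            # A callable overrides the default mapping
            annotated_path = annotated_path(annot_path)
        # Otherwise annotated_path is already a path and is used as-is.
        # Parsing of the annotation file itself is omitted from this sketch.
        return cls(annot_path, annotated_path)


# Usage: a user (or a downstream library) supplies their own mapping
def to_annotated_path(annot_path: pathlib.Path) -> pathlib.Path:
    return annot_path.with_suffix(".wav")


seq = ExampleSeq.from_file("annots/song1.csv", annotated_path=to_annotated_path)
```

Checking callable() before treating the argument as a path keeps the existing "pass a path" usage working unchanged.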
Where this might get complicated is when a single annotation file contains annotations for multiple annotated files. In that case, it does not make sense to determine the annotated_path from the annot_path; the annotations themselves must contain the path to each file they annotate, so that it is clear which annotation corresponds to which file. I guess the way to handle this is to just not have any annotated_path function parameters or class properties for these format classes. Downstream libraries (e.g., the vak example above) will need to check for the annotated_path attribute and decide what to do if they don't find it. It might make sense in that case to alternatively have an annotated_paths attribute that returns all the paths?
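The check a downstream library would need could be sketched roughly as follows. This assumes the attribute names proposed above (annotated_path for single-file formats, annotated_paths for multi-file formats); the helper and the two stand-in classes are hypothetical and exist only to exercise it.

```python
import pathlib


def annotated_paths_for(annot):
    """Hypothetical downstream helper: return the paths of all files that
    an annotation object annotates, whichever attribute the format exposes."""
    # Multi-file formats: the annotations themselves carry the paths
    if hasattr(annot, "annotated_paths"):
        return [pathlib.Path(p) for p in annot.annotated_paths]
    # Single-file formats: one annotated file per annotation file
    annotated_path = getattr(annot, "annotated_path", None)
    if annotated_path is not None:
        return [pathlib.Path(annotated_path)]
    raise ValueError(
        f"{type(annot).__name__} exposes neither annotated_paths nor annotated_path"
    )


class SingleFileAnnot:
    """Stand-in for a format where one annotation file annotates one file."""

    def __init__(self, annotated_path):
        self.annotated_path = annotated_path


class MultiFileAnnot:
    """Stand-in for a format where one annotation file annotates many files."""

    def __init__(self, annotated_paths):
        self.annotated_paths = annotated_paths
```

Returning a list in both cases means the caller does not have to branch on the format again after this one check.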
Additional context
Related issues and discussion here on vak: https://github.com/vocalpy/vak/issues/563#issue-1341736666