Closed NickleDave closed 1 month ago
Changed the title of this issue to indicate we should just worry about SimpleSeq for now since this is the main case where I'm running into this issue.
But leaving the description above because I think we might care about this more generally
To fix we should do the following:
- SimpleSeq.from_file: load the csv into a dataframe
- Sequence will also need to be able to exist as empty arrays, or we will need to complain loudly when someone tries to call to_seq on an empty SimpleSeq; not sure if Sequence throws an error right now when we do that
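A rough sketch of the "complain loudly" option, just to make it concrete. Everything here is hypothetical: the free function `to_seq` stands in for the real `SimpleSeq.to_seq` method, the plain dict stands in for a `Sequence`, and the argument names are assumptions, not crowsetta's actual API.

```python
import numpy as np


def to_seq(onsets_s, offsets_s, labels):
    """Hypothetical stand-in for SimpleSeq.to_seq; returns a dict instead of a Sequence."""
    onsets_s = np.asarray(onsets_s, dtype=float)
    offsets_s = np.asarray(offsets_s, dtype=float)
    labels = np.asarray(labels)

    if not (len(onsets_s) == len(offsets_s) == len(labels)):
        raise ValueError("onsets_s, offsets_s and labels must have the same length")

    # Complain loudly: refuse to build a sequence from an annotation with no segments.
    if onsets_s.size == 0:
        raise ValueError(
            "cannot convert an empty annotation to a sequence: it has no segments"
        )

    return {"onsets_s": onsets_s, "offsets_s": offsets_s, "labels": labels}


# Usage: a normal annotation converts, an empty one fails with a clear message.
seq = to_seq([0.0, 0.6], [0.5, 1.1], ["a", "b"])
try:
    to_seq([], [], [])
except ValueError as err:
    print(f"refused: {err}")
```

The other option, letting Sequence hold empty arrays, would drop the `size == 0` check and leave it to downstream code to handle zero-length sequences.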
Right now crowsetta does not have any logic for handling annotation files that have no annotations, i.e. that are empty.
This is kind of an edge case, since it's very likely that if you are already loading your annotations into Python from another tool you have specifically annotated what you are interested in.
But it can still happen: e.g. when creating clips of an existing dataset and re-making the annotations for the clips from an existing set of annotations, we can end up with a clip that has no segments in it.
Currently this raises a pretty inscrutable error from pandera.errors.SchemaError about "expected 'type' but got object" where 'type' will be the expected type for some column.
We should instead do a try-except and check if the dataframe is empty (df = pd.read_csv(annot_path); if len(df) == 0: pass), and if it is, return an empty instance of the class. If not, re-raise the error.
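A minimal sketch of that flow, under a few assumptions. `SimpleSeqSketch`, `validate`, and the local `SchemaError` class are all hypothetical stand-ins (for the real class, the pandera schema validation, and `pandera.errors.SchemaError`), and the column names `onset_s`, `offset_s`, `label` are assumed. It relies on the fact that `pd.read_csv` gives object-dtype columns for a csv that has a header but no rows, which is the kind of dtype mismatch the schema check trips on.

```python
import io
from dataclasses import dataclass

import numpy as np
import pandas as pd


class SchemaError(Exception):
    """Hypothetical stand-in for pandera.errors.SchemaError."""


def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for schema validation: onset/offset columns must be float."""
    for col in ("onset_s", "offset_s"):
        if not pd.api.types.is_float_dtype(df[col]):
            raise SchemaError(f"expected 'float64' but got {df[col].dtype} in column {col}")
    return df


@dataclass
class SimpleSeqSketch:
    onsets_s: np.ndarray
    offsets_s: np.ndarray
    labels: np.ndarray

    @classmethod
    def from_file(cls, annot_path):
        df = pd.read_csv(annot_path)
        try:
            df = validate(df)
        except SchemaError:
            if len(df) == 0:
                # No rows at all: return an empty instance instead of failing.
                return cls(
                    onsets_s=np.array([], dtype=float),
                    offsets_s=np.array([], dtype=float),
                    labels=np.array([], dtype=str),
                )
            # The file has rows but they are malformed: re-raise the original error.
            raise
        return cls(
            onsets_s=df["onset_s"].to_numpy(),
            offsets_s=df["offset_s"].to_numpy(),
            labels=df["label"].to_numpy(),
        )


# Usage: a header-only csv gives an empty instance rather than a SchemaError.
empty = SimpleSeqSketch.from_file(io.StringIO("onset_s,offset_s,label\n"))
print(len(empty.onsets_s))
```

Note this only covers a csv that still has its header row; a completely empty file makes `pd.read_csv` raise `pandas.errors.EmptyDataError` before validation, so that case would need its own handling.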