But now we no longer make individual Segments in the 'generic' format (that replaced 'csv').
Instead we make entire Sequences using the from_keyword instead, parsing csv files with pandas to avoid the maintenance work of doing it ourselves: https://github.com/vocalpy/crowsetta/blob/f95b08ae7556b336d0e99044df0947a0060163aa/src/crowsetta/formats/seq/generic.py#L265
(We grab a sequence at a time from the dataframe, and then get arrays from each column 'onset_s', 'offset_s', etc., via the values attribute.)
But then these converters are doing something extra weird, looking for a string "None":
def float_or_None(val):
"""converter that casts val to float, or returns None if val is string 'None'"""
if val == "None":
return None
else:
return float(val)
What's actually going on is that these are doing the work of a Segment.from_row method that can handle a row of strings from a csv. See this unit test for an example:
def test_Segment_init_onset_offset_in_seconds_from_row():
header = ["label", "onset_s", "offset_s", "onset_sample", "offset_sample"]
row = ["a", "0.123", "0.170", "None", "None"]
a_segment = Segment.from_row(row=row, header=header)
for attr in ["label", "onset_s", "offset_s"]:
assert hasattr(a_segment, attr)
for attr in ["onset_sample", "offset_sample"]:
assert getattr(a_segment, attr) is None
and the leftover logic for it is confusing.
This method was used by the now-removed 'csv' format, see for example here: https://github.com/vocalpy/crowsetta/blob/a47798ac54440cd3a974a8dccb83154ae4ea7d1f/src/crowsetta/csv.py#L224
But now we no longer make individual
Segment
s in the 'generic' format (that replaced 'csv'). Instead we make entireSequence
s using thefrom_keyword
instead, parsing csv files withpandas
to avoid the maintenance work of doing it ourselves: https://github.com/vocalpy/crowsetta/blob/f95b08ae7556b336d0e99044df0947a0060163aa/src/crowsetta/formats/seq/generic.py#L265 (We grab a sequence at a time from the dataframe, and then get arrays from each column 'onset_s', 'offset_s', etc., via thevalues
attribute.)I realized
Segment.from_row
is no longer used as I was trying to write examples for the Segment docstring as suggested here in the pyOpenSci review: https://github.com/pyOpenSci/software-submission/issues/68#issuecomment-1451660136We should just remove it and do the more intuitive thing with attrs:
attrs
optional
and have these default toNone
(NoneType, not a string "None")__post_init__
thing where we throw an error if at least one of them is not specifiedDumping a more detailed explanation of issues cause by
from_row
, from my dev notes: