Closed dbczumar closed 1 month ago
Thank you so much @dbczumar ! This looks great to me. I realized I had made a mistake in discussing the if statement that decides between formatting options. I corrected my error in there and updated the comments to reflect this.
Here's the new snippet:
if dspy_field_type == "input" or field_info.annotation is str:
# If the field is an input field or has no special type reqs, format as a numbered list
# so it's organized in a way suitable for presenting long context, i.e. not JSON.
return format_input_list_field_value(value)
else:
# If the field is an output field that has strict parsing reqs, format the value as
# a stringified JSON Array. This ensures that downstream routines can parse the
# field value correctly using methods from the `ujson` or `json` packages.
return json.dumps(value)
tl;dr All inputs can safely be formatted as numbered lists, since parsing is not a concern. Outputs with no parsing requirements (i.e., outputs that are specified as strings) also fall under that, i.e. format as numbered lists. Otherwise, we use JSON.
I would happily merge at this point but looks like we have plenty of failing tests. The tests failing are quite new (from #1595 cc @mikeedjones) so maybe we're learning something useful here about the changes that I missed in my review or about the tests.
Thank you so much @dbczumar ! This looks great to me. I realized I had made a mistake in discussing the if statement that decides between formatting options. I corrected my error in there and updated the comments to reflect this.
Here's the new snippet:
if dspy_field_type == "input" or field_info.annotation is str: # If the field is an input field or has no special type reqs, format as a numbered list # so it's organized in a way suitable for presenting long context, i.e. not JSON. return format_input_list_field_value(value) else: # If the field is an output field that has strict parsing reqs, format the value as # a stringified JSON Array. This ensures that downstream routines can parse the # field value correctly using methods from the `ujson` or `json` packages. return json.dumps(value)
tl;dr All inputs can safely be formatted as numbered lists, since parsing is not a concern. Outputs with no parsing requirements (i.e., outputs that are specified as strings) also fall under that, i.e. format as numbered lists. Otherwise, we use JSON.
Thanks @okhat ! I've incorporated your edit :)
I would happily merge at this point but looks like we have plenty of failing tests. The tests failing are quite new (from #1595 cc @mikeedjones) so maybe we're learning something useful here about the changes that I missed in my review or about the tests.
I think the test issues are caused by a bug in my change to DummyLM
, which should hopefully be fixed in my latest couple of commits. Fingers crossed! :D
Before merge, I'd like to introduce some test coverage to catch similar regressions in the future.
Fixes #1567