hubverse-org / schemas

JSON schemas for modeling hubs
Creative Commons Zero v1.0 Universal

add capacity to link integers with category numbers for categorical targets #39

Closed by nickreich 1 year ago

nickreich commented 1 year ago

In a categorical target definition, could we use what is currently stored in model_tasks > output_type > categorical > type_id to encode a mapping between integers and categories, so that we could have sample-based representations of categories? E.g. a table like

output_type | type_id | value
----------- | ------- | -----
"sample"    | 1       | 4
"sample"    | 2       | 3
"sample"    | 3       | 7
"sample"    | 4       | 4

where type_id is the index of the sample and value is the integer code for the category. Note that categorical targets are unusual: the categories can show up as the type_id for the pmf output_type but also as the value for the sample output_type.
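To make the proposal concrete, here is a minimal Python sketch that decodes the sample rows from the table above and then summarizes them as pmf rows, where the category moves from value to type_id. The `CATEGORY_CODES` mapping, its labels, and the helper names are hypothetical and are not part of the hubverse schema.

```python
from collections import Counter

# Hypothetical mapping from integer codes to category labels.
# Neither the labels nor this structure come from the hubverse schema.
CATEGORY_CODES = {3: "low", 4: "moderate", 7: "high"}

# Rows from the table above: (output_type, type_id, value)
rows = [
    ("sample", 1, 4),
    ("sample", 2, 3),
    ("sample", 3, 7),
    ("sample", 4, 4),
]


def decode_samples(rows, codes):
    """Replace each integer value with its category label."""
    return [(otype, tid, codes[value]) for otype, tid, value in rows]


def samples_to_pmf(decoded):
    """Summarize decoded samples as pmf rows; the category becomes the type_id."""
    counts = Counter(category for _, _, category in decoded)
    total = sum(counts.values())
    return [("pmf", category, n / total) for category, n in sorted(counts.items())]


decoded = decode_samples(rows, CATEGORY_CODES)
pmf = samples_to_pmf(decoded)
```

In `decoded` the category sits in the value position, while in `pmf` it sits in the type_id position, which is the asymmetry described above.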

Possibly related to the question about whether output_type should be a property of a target specifically or not.

nickreich commented 1 year ago

@elray1 notes that this would mean the file is no longer explicit on its own: you can't tell what the values mean from the file itself, and you need additional information (the integer-to-category mapping) to make it readable.
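A small Python sketch of that concern, using the value column from the table in this issue: the same integers decode to different categories depending on which mapping is supplied. Both mappings and the `decode` helper are hypothetical, for illustration only.

```python
# The value column from the sample table in this issue.
values = [4, 3, 7, 4]

# Two hypothetical mappings; neither comes from a real hub config.
mapping_a = {3: "low", 4: "moderate", 7: "high"}
mapping_b = {3: "decrease", 4: "stable", 7: "increase"}


def decode(values, mapping):
    """Look up the category label for each integer code."""
    return [mapping[v] for v in values]


labels_a = decode(values, mapping_a)
labels_b = decode(values, mapping_b)
```

The file contents are identical in both cases; only the external mapping determines what the integers mean.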

annakrystalli commented 1 year ago

I'm guessing we will not be pursuing this idea anymore?

nickreich commented 1 year ago

Yeah, I think we should close for now.