hubverse-org / hubDocs

https://hubverse.io

Description of model output columns is unclear #115

Closed lshandross closed 4 weeks ago

lshandross commented 2 months ago

In the Formats of model output subsection of the model-output page, the way the different columns are explained seems to differ between the first paragraph and the bullet points below it, which can lead to confusion.

In the paragraph, the documentation says that "columns define: (1) the model task, (2) specification of the representation of the model output, and (3) the model output value." This explanation separates the value column as distinct from the output_type and output_type_id columns. However, in the bullet points below, all three of these columns are described as part of the "model output representation" category.

We should choose a single way of defining the groupings, i.e. deciding whether or not the value column is considered part of the "model output representation" group, and update the documentation for consistency to avoid confusing users.
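To make the difference concrete, here is a minimal sketch of the two groupings described above. The task ID column names (`location`, `horizon`) and the row values are hypothetical and only for illustration; the `output_type`, `output_type_id`, and `value` column names and the group labels come from the documentation being discussed.

```python
# Hypothetical model output row. "location" and "horizon" stand in for
# whatever task ID columns a hub defines; they are assumptions.
row = {
    "location": "US",
    "horizon": 1,
    "output_type": "quantile",
    "output_type_id": 0.5,
    "value": 120.0,
}

# Grouping used in the first paragraph: value is its own group.
grouping_paragraph = {
    "model task": ["location", "horizon"],
    "model output representation": ["output_type", "output_type_id"],
    "model output value": ["value"],
}

# Grouping used in the bullet points: value is part of the representation.
grouping_bullets = {
    "model task": ["location", "horizon"],
    "model output representation": ["output_type", "output_type_id", "value"],
}


def group_of(column, grouping):
    """Return the name of the group that contains `column`."""
    for name, columns in grouping.items():
        if column in columns:
            return name
    raise KeyError(column)


def disagreements(columns, a, b):
    """List the columns that the two groupings place in different groups."""
    return [col for col in columns if group_of(col, a) != group_of(col, b)]


differing_columns = disagreements(row, grouping_paragraph, grouping_bullets)
print(differing_columns)
```

Under these assumptions, the only column the two descriptions disagree on is `value`, which is the single decision this issue asks for.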

elray1 commented 2 months ago

Good suggestion, Li, thanks. My personal preference is to describe value separately from output_type and output_type_id, because the value is provided by the modeler while the others are specified by the hub.

micokoch commented 1 month ago

@lshandross - I used the hubEnsembles manuscript verbiage to clarify the issue you brought up. I used much of your own wording (I hope that's okay), because it seemed better than paraphrasing an already clear explanation. @elray1 - I assumed that, since it's in the manuscript, it had been agreed that the output_type, output_type_id, and value columns are all part of the "model output representation." If this is incorrect, then it should probably be changed in the manuscript as well. I put all the changes in PR #117. If it is approved, I will close this issue. Thanks for taking a look at this.

lshandross commented 1 month ago

@micokoch I wrote the content in the manuscript based on the bullet points in the documentation, and only realized several weeks ago that the documentation was not consistent. So I don't think it has been definitively decided that the output_type, output_type_id, and value columns are all part of the "model output representation." I feel we should make a conclusive decision one way or the other, then make changes in the appropriate places based on that decision.