RDFBones / RDFBonesPhaleron

An RDFBones implementation of the data collection routines developed for the Phaleron Bioarchaeological Project.

Skeletal Inventory Output Issues #62

Open jessica-rothwell opened 2 years ago

jessica-rothwell commented 2 years ago

Is your feature request related to a problem? Please describe.

The current output of the Skeletal Inventory, to my understanding, is an attribute table (in the form of a .csv file) wherein the “ROI” is the Object with the following Attributes: “ID”, “Measurement Type”, and “Measurement”.

While this is functional for machine communication between database modules, it is not useful when a researcher needs to read it. Currently there is no way to sort the data by skeletal region, or even by skeletal element. Instead, the .csv output sorts the data in alphabetical order.

This is a major problem because it is absolutely necessary for researchers to be able to read the inventory to check for data quality. This is not a matter of “making it pretty”; rather, it is an issue of fundamental database functionality and of meeting the needs of the research team. As it stands, the Excel spreadsheets are still more useful than the database output in terms of actually being readable and usable.

Describe the solution you'd like

One possible solution to this would be to add additional attributes to the attribute table to make it easier for researchers to sort the data into an order that is anatomically meaningful. For example: the attributes of “Skeletal Element” and “Skeletal Region” could be added for each ROI so that it is possible to connect each ROI to the specific skeletal element and skeletal region it refers to. Ideally the subcategories for these attributes should be easily sortable in Excel either numerically or alphabetically.

If we would like to make the Objects sortable into a similar order as one would see in the Excel sheet, it may even make sense to add another attribute of “Data Collection Order” or something like that where each ROI is assigned a number that corresponds to the order we would like it to show up in the .csv output. That way the data would be numerically sortable for researchers interested in reading the data output.
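To make the proposal concrete, here is a minimal sketch of how a numeric “Data Collection Order” column could be used to sort the exported table. The column names follow the attributes suggested above; the sample rows, IDs, values, and the function name `sort_by_collection_order` are invented for illustration and are not taken from the actual export.

```python
import csv
import io

# Hypothetical sample of the extended attribute table. The last three
# columns are the proposed additions; all values are made up.
SAMPLE_CSV = """ROI,ID,Measurement Type,Measurement,Skeletal Region,Skeletal Element,Data Collection Order
Atlas,a1,Completeness,complete,Spine,Atlas,3
Left frontal pars orbitalis,f1,Observability,observable,Cranium,Frontal,1
Axis,a2,Completeness,partial,Spine,Axis,4
Right frontal pars orbitalis,f2,Observability,observable,Cranium,Frontal,2
"""


def sort_by_collection_order(csv_text):
    """Parse the CSV text and return its rows sorted by the numeric order column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Convert to int so that "10" sorts after "9" rather than after "1".
    return sorted(rows, key=lambda row: int(row["Data Collection Order"]))


sorted_rows = sort_by_collection_order(SAMPLE_CSV)
for row in sorted_rows:
    print(row["Data Collection Order"], row["Skeletal Region"], row["ROI"])
```

The same column would also let a researcher sort the file directly in Excel without any scripting, which is the main point of the request.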

15E601 commented 10 months ago

This issue has been partially solved according to the solutions noted in RDFBones/Phaleron-SkeletalInventory#22. The system currently in place uses the following sorting hierarchy (always in descending alphabetical order for each tier):

  1. Section
  2. ROI
  3. Type of measurement (completeness/observability, etc.)

This method goes most of the way toward reflecting the order we have in the AnthroGraph interface, but not fully, because of the alphabetical ordering. Starting with the arm section instead of the cranium seems like a manageable change, but for the ROIs I could see this getting more annoying, e.g. the Atlas "randomly" being at the very bottom of the spine:

image

To fix this, we could add sorting directives where we manually decide which element goes where. This requires manually connecting each ROI with a sorting class, which is a lot of work, so I would like to know beforehand:

  1. is this required, or is the current system sufficient?
  2. what shall the sorting method be? My suggestion would be to sort exactly as the ROIs are listed in AnthroGraph. So start at cranium section, left frontal pars orbitalis, observability and completeness, then right frontal pars orbitalis obs. and completeness, etc. etc.
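As a rough illustration of what manual sorting would mean in practice, the sketch below ranks each row by its position in hand-written lists rather than alphabetically. The three lists and the sample rows are hypothetical stand-ins; the real orderings would have to be copied from the AnthroGraph interface, and the actual implementation would live in the query and ontology rather than in Python.

```python
# Hypothetical orderings; the real ones would mirror the AnthroGraph interface.
SECTION_ORDER = ["Cranium", "Spine", "Arm"]
ROI_ORDER = [
    "Left frontal pars orbitalis",
    "Right frontal pars orbitalis",
    "Atlas",
    "Axis",
]
MEASUREMENT_ORDER = ["Observability", "Completeness"]


def rank(value, order):
    """Return the position of value in order; unknown values sort last."""
    try:
        return order.index(value)
    except ValueError:
        return len(order)


def sort_key(row):
    """Build a three-tier key: section, then ROI, then measurement type."""
    section, roi, measurement_type = row
    return (
        rank(section, SECTION_ORDER),
        rank(roi, ROI_ORDER),
        rank(measurement_type, MEASUREMENT_ORDER),
    )


rows = [
    ("Spine", "Axis", "Completeness"),
    ("Spine", "Atlas", "Observability"),
    ("Cranium", "Right frontal pars orbitalis", "Completeness"),
    ("Cranium", "Left frontal pars orbitalis", "Completeness"),
    ("Cranium", "Left frontal pars orbitalis", "Observability"),
    ("Spine", "Atlas", "Completeness"),
]

manual = sorted(rows, key=sort_key)
for row in manual:
    print(row)
```

The effort mentioned above corresponds to filling in `ROI_ORDER` completely: every ROI needs an explicit position, while the section and measurement-type lists stay short.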

cuboideum commented 10 months ago

New sorting directives need to be introduced that reference both the ROI and the measurement datum. The combination of these two parameters is what makes each line of the query unique.

Also, bear in mind that sorting directives need to be context-specific, i.e. defined for a certain dataset.
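One way to picture such directives is a lookup keyed by dataset, ROI, and measurement datum together. This is only a sketch under assumptions: the dictionary `SORTING_DIRECTIVES`, the dataset names, and the positions are hypothetical and do not reflect how directives are modelled in RDFBones.

```python
# Hypothetical directive table: (dataset, ROI, measurement datum) -> position.
# The dataset is part of the key, so the same ROI and measurement datum
# can be placed differently in another dataset.
SORTING_DIRECTIVES = {
    ("Skeletal Inventory", "Atlas", "Observability"): 1,
    ("Skeletal Inventory", "Atlas", "Completeness"): 2,
    ("Skeletal Inventory", "Axis", "Observability"): 3,
    ("Skeletal Inventory", "Axis", "Completeness"): 4,
    ("Other Dataset", "Axis", "Completeness"): 1,
    ("Other Dataset", "Atlas", "Completeness"): 2,
}


def sort_rows(dataset, rows):
    """Sort (ROI, measurement datum) pairs using the directives of one dataset.

    Rows without a directive are placed last and keep their input order,
    because Python's sort is stable.
    """
    def key(row):
        roi, datum = row
        return SORTING_DIRECTIVES.get((dataset, roi, datum), float("inf"))

    return sorted(rows, key=key)


rows = [
    ("Axis", "Completeness"),
    ("Atlas", "Completeness"),
    ("Atlas", "Observability"),
]

inventory_order = sort_rows("Skeletal Inventory", rows)
other_order = sort_rows("Other Dataset", rows)
print(inventory_order)
print(other_order)
```

Because each output line is identified by the ROI and measurement datum pair, keying the directive on both gives every line exactly one position per dataset.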

15E601 commented 10 months ago

> The combination of these two parameters is what makes each line of the query unique.

Why is this mandatory? Are ROIs precluded from having sorting directives assigned to them? Why is it not acceptable to have the order default to Observability then Representation in the list as it is now?

Edit: Wait, I think I misunderstood that. You mean we need one set of directives for ROIs and a separate one for MDs, yes? That makes more sense, though in the ordering of the lists the MDs still don't really matter that much.