Software and equipment schema issues

kobbejager commented 6 months ago

[x] Software|Equipment > has_defined_components > component_type: needs a controlled list. At the moment this is an "examples" list with "Detector, Column, Camera", both in software and equipment. This will need to be two different things
[x] Software|Equipment > has_defined_components > tool: searches for equipment and software, but can software be composed of equipment?
[x] #20
[x] Software|Equipment > has_url > link_type: needs a controlled list? Now a fixed enum with values "website, documentation, source code, demo, online resource, other"
[x] Software > licence: has "format": "uri". Do all software licences have URLs? Maybe too complex, but we could also give the opportunity to choose between (1) a controlled list of preset licences, (2) a URL, (3) a large textbox

jpadfield commented 6 months ago

1: That makes sense, two lists, one for software components and one for hardware components. I do wonder how much this will be used though :-)

2: A software system could have hardware components - robotics, storage, CPUs/GPUs, etc. - but I would expect that most will not. I suppose it will come down to whether the components alter how the results are interpreted, as opposed to just being a point of information.

3: See JSON blocks below as a start:

{
  "DataInputTypes": {
    "TextualData": ["PlainText", "Documents", "Emails", "WebPages"],
    "NumericalData": ["SpreadsheetData", "TimeSeriesData", "TransactionalData"],
    "MultimediaData": ["Images", "Audio", "Video"],
    "ScientificAndTechnicalData": {
      "Basic": ["SensorData", "GISData", "GenomicData", "ChemicalStructures"],
      "Detailed": {
        "HierarchicalData": ["Multi-LevelJSON_XMLData", "ComplexSpreadsheets", "HierarchicalDatabaseRecords"],
        "DetailedGeographicData": ["Multi-LayerGISData", "DetailedTopographicalMaps"],
        "ComplexScientificData": ["Multi-DimensionalArrays", "DetailedGenomicSequences", "NestedChemicalStructures"]
      }
    },
    "StructuredData": ["Databases", "XML_JSONData"],
    "UnstructuredData": ["Logs", "SocialMediaPosts", "ForumThreads"]
  }
}

{
  "DataOutputTypes": {
    "Visualizations": ["Charts", "Graphs", "Maps", "Dashboards"],
    "StatisticalReports": ["SummaryStatistics", "CorrelationReports", "RegressionAnalysis"],
    "PredictiveModels": ["ClassificationModels", "ForecastingModels", "RiskAssessmentModels"],
    "TextualAnalysisOutputs": ["SentimentAnalysis", "TopicModeling", "TextSummarization"],
    "DataFiles": ["ProcessedDatasets", "ModelCheckpoints", "LogFiles"],
    "TechnicalReports": ["PerformanceAnalysisReports", "AuditTrails", "CodeAnalysisReports"],
    "MultimediaOutputs": ["EditedImagesOrVideos", "AudioTranscriptions", "3DModels"],
    "Advanced": {
      "AdvancedVisualizations": {
        "InteractiveDashboards": ["Drill-DownCapabilities"],
        "Multi-LayerMaps": ["DifferentLayersOfGeographicData"],
        "ComplexGraphs": ["HierarchicalRelationships", "Multi-DimensionalDataPoints"]
      },
      "DetailedAnalyticalReports": {
        "ReportsWithNestedSections": ["DetailedBreakdowns"],
        "Multi-LevelStatisticalAnalysis": ["OverallTrends", "SpecificsDrillDown"]
      },
      "ComplexPredictiveModels": {
        "ModelsWithMulti-LevelOutputs": ["PrimaryPrediction", "SecondaryPredictiveFactors"],
        "HierarchicalRiskAssessmentModels": ["OverallRiskScores", "SpecificRiskFactorsBreakdowns"]
      },
      "StructuredDataOutputs": {
        "NestedDataFiles": ["JSON_XMLFilesWithComplexStructures"],
        "DetailedModelOutputs": ["LayersOfInterpretationOrAnalysis"]
      }
    }
  }
}

4: Ok as a start:


{
  "URI_Types": {
    "ProjectResources": {
      "Website": {
        "Description": "Main website or homepage of the project.",
        "Examples": ["Official project homepage", "Product landing page"]
      },
      "Documentation": {
        "Description": "Technical documentation, user guides, and API references.",
        "Examples": ["API documentation", "User manuals"]
      },
      "SourceCode": {
        "Description": "Repositories hosting the source code of the project.",
        "Examples": ["GitHub repository", "Bitbucket repository"]
      },
      "Demo": {
        "Description": "Demonstrations or live examples of the project in action.",
        "Examples": ["Interactive demos", "Live application demos"]
      },
      "OnlineResource": {
        "Description": "Miscellaneous online resources related to the project.",
        "Examples": ["Related web applications", "Supplementary online tools"]
      },
      "MultimediaContent": {
        "Description": "Video and audio content related to the project.",
        "Examples": ["Tutorial videos", "Podcast episodes"]
      },
      "Publication": {
        "Description": "Academic and professional publications related to the project.",
        "Examples": ["Research papers", "Conference proceedings"]
      },
      "FAQ": {
        "Description": "Frequently asked questions about the project.",
        "Examples": ["Product FAQ", "Technical issues FAQ"]
      },
      "Support": {
        "Description": "Support and contact information for the project.",
        "Examples": ["Support center", "Contact form URL"]
      },
      "Forum": {
        "Description": "Discussion forums or community platforms related to the project.",
        "Examples": ["User forums", "Developer communities"]
      },
      "SocialMedia": {
        "Description": "Social media profiles and pages related to the project.",
        "Examples": ["Twitter profile", "Facebook page"]
      },
      "Download": {
        "Description": "Links to download software, datasets, or other materials.",
        "Examples": ["Software download page", "Dataset download link"]
      },
      "ReleaseNotes": {
        "Description": "Information about the version history and updates.",
        "Examples": ["Changelog", "Update history"]
      },
      "License": {
        "Description": "Licensing information for the project.",
        "Examples": ["Open-source license details", "Commercial license terms"]
      },
      "APIEndpoint": {
        "Description": "Endpoints for accessing APIs provided by the project.",
        "Examples": ["REST API endpoint", "GraphQL endpoint"]
      },
      "TrainingMaterials": {
        "Description": "Educational materials and training resources.",
        "Examples": ["Online courses", "Workshops"]
      },
      "CaseStudies": {
        "Description": "Examples of the project or product in use.",
        "Examples": ["Customer success stories", "Use case descriptions"]
      },
      "Testimonials": {
        "Description": "User feedback and testimonials about the project.",
        "Examples": ["Customer reviews", "User testimonials"]
      },
      "Gallery": {
        "Description": "Image galleries showcasing the project or its use cases.",
        "Examples": ["Product image gallery", "Event photos"]
      }
    }
  }
}
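
A minimal sketch, assuming the keys under ProjectResources above would become the controlled link_type values: it checks whether each value of the current fixed enum (as quoted in the opening comment) still has a counterpart in the proposed list. The `normalise` and `unmapped_values` helpers are hypothetical names introduced only for this illustration.

```python
import re

# Keys of the URI_Types > ProjectResources block proposed above.
PROPOSED_LINK_TYPES = [
    "Website", "Documentation", "SourceCode", "Demo", "OnlineResource",
    "MultimediaContent", "Publication", "FAQ", "Support", "Forum",
    "SocialMedia", "Download", "ReleaseNotes", "License", "APIEndpoint",
    "TrainingMaterials", "CaseStudies", "Testimonials", "Gallery",
]

# Values of the current fixed enum, as quoted in the opening comment.
CURRENT_LINK_TYPES = [
    "website", "documentation", "source code", "demo", "online resource", "other",
]


def normalise(label: str) -> str:
    """Lower-case a label and keep only letters and digits (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def unmapped_values(current, proposed):
    """Return the current enum values that have no counterpart in the proposed keys."""
    proposed_keys = {normalise(p) for p in proposed}
    return [c for c in current if normalise(c) not in proposed_keys]


missing = unmapped_values(CURRENT_LINK_TYPES, PROPOSED_LINK_TYPES)
print(missing)  # → ['other']
```

Under these assumptions, every current value maps onto a proposed key except "other", so a catch-all entry may still be worth keeping if the list becomes a fixed enum.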

5: The flexibility is good, but I wonder if it makes it too complex? I would say it is reasonable to require the licence to be "published" and accessible via a URL - I suppose the best of both worlds would be an "example" set of common URLs for licences, and then allow people to add their own institutional URL as needed.
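
A minimal sketch of that "best of both worlds" idea, assuming a small preset table plus a fallback to any http(s) URL. The preset entries (SPDX licence pages) and the `resolve_licence` helper are illustrative assumptions, not an agreed list or an existing schema feature.

```python
from urllib.parse import urlparse

# Assumed example presets; the thread has not agreed on any actual list.
PRESET_LICENCE_URLS = {
    "MIT": "https://spdx.org/licenses/MIT.html",
    "Apache-2.0": "https://spdx.org/licenses/Apache-2.0.html",
    "GPL-3.0-only": "https://spdx.org/licenses/GPL-3.0-only.html",
    "CC0-1.0": "https://spdx.org/licenses/CC0-1.0.html",
}


def resolve_licence(value: str) -> str:
    """Return a licence URL for a preset name or a user-supplied http(s) URL."""
    value = value.strip()
    if value in PRESET_LICENCE_URLS:
        return PRESET_LICENCE_URLS[value]
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value  # e.g. an institutional licence page
    raise ValueError(f"not a preset licence or an http(s) URL: {value!r}")


print(resolve_licence("MIT"))
print(resolve_licence("https://example.org/our-licence"))
```

A value such as "see LICENSE.txt in the zip" would be rejected here, which is exactly the case a free-text option would have to cover.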

kobbejager commented 6 months ago

> 1: That makes sense, two lists, one for software components and one for hardware components. I do wonder how much this will be used though :-)

Probably not much. Given the large number of possible values, it might be sensible either to turn it into a "component_function" free-text field, or to remove it altogether?

> 2: A software system could have hardware components - robotics, storage, CPUs/GPUs, etc. - but I would expect that most will not. I suppose it will come down to whether the components alter how the results are interpreted, as opposed to just being a point of information.

This is a chicken-or-egg discussion. I would argue that the computer or robot is the main system and the software is a component of this ;-) We can leave this as is.

> 3: See JSON blocks below as a start:

Good starting point. Would you, in a flat controlled list (or fixed enum?) go to the lowest level of detail?
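
To make the trade-off concrete, here is a minimal sketch that flattens the DataInputTypes block quoted earlier in the thread and compares the size of a top-level list with a lowest-level one. The `leaf_values` helper is a hypothetical name; the sketch does not settle which level the schema should use.

```python
# Contents of the "DataInputTypes" object from the JSON block in the thread.
DATA_INPUT_TYPES = {
    "TextualData": ["PlainText", "Documents", "Emails", "WebPages"],
    "NumericalData": ["SpreadsheetData", "TimeSeriesData", "TransactionalData"],
    "MultimediaData": ["Images", "Audio", "Video"],
    "ScientificAndTechnicalData": {
        "Basic": ["SensorData", "GISData", "GenomicData", "ChemicalStructures"],
        "Detailed": {
            "HierarchicalData": ["Multi-LevelJSON_XMLData", "ComplexSpreadsheets",
                                 "HierarchicalDatabaseRecords"],
            "DetailedGeographicData": ["Multi-LayerGISData", "DetailedTopographicalMaps"],
            "ComplexScientificData": ["Multi-DimensionalArrays", "DetailedGenomicSequences",
                                      "NestedChemicalStructures"],
        },
    },
    "StructuredData": ["Databases", "XML_JSONData"],
    "UnstructuredData": ["Logs", "SocialMediaPosts", "ForumThreads"],
}


def leaf_values(node):
    """Collect the lowest-level strings of a nested dict/list taxonomy, in order."""
    if isinstance(node, dict):
        leaves = []
        for child in node.values():
            leaves.extend(leaf_values(child))
        return leaves
    return list(node)


top_level = list(DATA_INPUT_TYPES)            # coarse enum: category names only
lowest_level = leaf_values(DATA_INPUT_TYPES)  # fine enum: every leaf value

print(len(top_level), len(lowest_level))  # → 6 27
```

For the input types alone, the flat list grows from 6 entries at the top level to 27 at the lowest level.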

> 4: Ok as a start:

Seems ok.

> 5: The flexibility is good, but I wonder if it makes it too complex? I would say it is reasonable to require the licence to be "published" and accessible via a URL - I suppose the best of both worlds would be an "example" set of common URLs for licences, and then allow people to add their own institutional URL as needed.

Not all licences have URLs. For software, licence files are commonly added to the source. With the MIT licence, for example, the author has to put their name in the licence text file, and this file might not be directly linkable (if it sits inside a zip file, for example).

On the other hand, do we actually need to register the exact licence? Could we not have a simplified enum with:

Public domain (CC0, The Unlicense)
Permissive licence (CC-BY, MIT, BSD, Apache)
Copyleft (protective) licence (CC-BY-SA, GPL, AGPL)
Noncommercial licence (CC-BY-NC, ...)
Proprietary licence

(source: https://en.wikipedia.org/wiki/Permissive_software_license)
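
A minimal sketch of such a simplified enum, assuming the five categories above and using only the example licences named in the list. The `LicenceCategory` class and the `categorise` helper are hypothetical names; the "..." in the noncommercial entry is left unfilled.

```python
from enum import Enum
from typing import Optional


class LicenceCategory(Enum):
    PUBLIC_DOMAIN = "public domain"
    PERMISSIVE = "permissive licence"
    COPYLEFT = "copyleft (protective) licence"
    NONCOMMERCIAL = "noncommercial licence"
    PROPRIETARY = "proprietary licence"


# Example licences per category, copied from the list in the comment above.
EXAMPLES = {
    LicenceCategory.PUBLIC_DOMAIN: ["CC0", "The Unlicense"],
    LicenceCategory.PERMISSIVE: ["CC-BY", "MIT", "BSD", "Apache"],
    LicenceCategory.COPYLEFT: ["CC-BY-SA", "GPL", "AGPL"],
    LicenceCategory.NONCOMMERCIAL: ["CC-BY-NC"],
    LicenceCategory.PROPRIETARY: [],  # no named examples in the thread
}


def categorise(name: str) -> Optional[LicenceCategory]:
    """Map a licence name to its category; None if it is not among the examples."""
    wanted = name.strip().lower()
    for category, names in EXAMPLES.items():
        if wanted in (n.lower() for n in names):
            return category
    return None


print(categorise("MIT"))   # → LicenceCategory.PERMISSIVE
print(categorise("EUPL"))  # → None
```

Licences outside the example lists return None in this sketch, so a real form would still need the user to pick a category by hand.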

E-RIHS / schema

Software and equipment schema issues #14