linkml / linkml

Linked Open Data Modeling Language
https://linkml.io/linkml
Other
302 stars 91 forks source link

generate xlsx spreadsheet templates from a schema #110

Closed cmungall closed 2 years ago

cmungall commented 3 years ago

Given a schema S, generate an xsls file (or google sheet) with one sheet for each class C (or a user-selected subset).

Each tab will have a column C for each slot inferred to be applicable for C

If the range of a slot is an enum, constrain the possible values

By default only include ranges that are primitive, but make the configurable (e.g. mixs templates have values that are classes with a has_raw_value slot, cc @wdduncan )

This will use the existing generator framework. The python class will be generators.excel_template. use an existing one like jsonschemagen as code template

TBD: for ranges that are ontology classes, see if there is a way to configure for rightfield/populous

For ranges that are enums, use drop downs. (google sheets: https://support.google.com/docs/answer/186103?co=GENIE.Platform%3DDesktop&hl=en)

Additionally, create a data dictionary tab

TBD: library/framework

Do we have both google sheets and excel writers, or target one and generate from the other?

There are a few options, the main two seem to be

I think native support is quite limited. I am not sure yet of the differences between the other two but am tending towards openpyxl

The validation support seems good https://openpyxl.readthedocs.io/en/latest/validation.html

Some notes on existing frameworks:

sujaypatil96 commented 2 years ago

@cmungall: I was reading through this, and I just wanted to leave some comments on this issue to make sure I've understood it correctly.

Do we have both google sheets and excel writers, or target one and generate from the other?

Yup, I think that would be best. Not sure which one's easier though, going from Excel to Google Sheet or vice versa. I've played around with Google APIs, so I can implement both. For Google Sheets, there's aspects of authorization that the user has to go through with the necessary credentials, to generate or render the sheet even. And there's a fair bit of setup of the application on the Google Developers Console too. Perhaps we can follow the first route of going from Excel to Google Sheets for now?

As for the library to use, I was going through awesome-python, and there are following libraries:

The second and third libraries seem to be the more popular and most maintained ones.

cmungall commented 2 years ago

Sounds like a plan!

Happy to go with excel as first target

On Thu, Aug 12, 2021 at 11:20 AM Sujay Patil @.***> wrote:

@cmungall https://github.com/cmungall: I was reading through this, and I just wanted to leave some comments on this issue to make sure I've understood it correctly.

  • If we were to consider this linkml example https://github.com/linkml/linkml#examples schema, there are three classes (organization, employee, manager), so there'd be three worksheets in an Excel workbook
  • Each of these worksheets will have inferred columns, based on the slots associated with each class. For ex., for class organization, we'd have three columns in the organization worksheet: id, name and has boss
  • Check the slot definition in the schema. Check slots like [slots][id] (without a range) and other slots like [slots][has boss] (with a range) and determine if they're primitive or not
  • Create data dictionary worksheet as the last sheet of the workbook

Do we have both google sheets and excel writers, or target one and generate from the other?

Yup, I think that would be best. Not sure which one's easier though, going from Excel to Google Sheet or vice versa. I've played around with Google APIs, so I can implement both. For Google Sheets, there's aspects of authorization that the user has to go through with the necessary credentials, to generate or render the sheet even. And there's a fair bit of setup of the application on the Google Developers Console too. Perhaps we can follow the first route of going from Excel to Google Sheets for now?

As for the library to use, I was going through awesome-python https://github.com/vinta/awesome-python, and there are following libraries:

The second and third libraries seem to be the more popular and most maintained ones.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/linkml/linkml/issues/110#issuecomment-897866414, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAMMOK4IMCQFW3SJF66DR3T4QGGJANCNFSM4Z2LXM4Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email .

turbomam commented 2 years ago

Excited to see @sujaypatil96 taking on an issue that I have dodged since my first day.

@cmungall can I be taken off the assignee list?

sujaypatil96 commented 2 years ago

Posted a query in PR #302 here.