acdh-oeaw / dhcr-main

Digital Humanities Course Registry Application
https://dhcr.clarin-dariah.eu/
Apache License 2.0

Add additional metadata field for registering RI datasets / training / learning material used in a course or programme #48

Open vronk opened 1 year ago

vronk commented 1 year ago

Provide explicit definition of a "MOOC" and also other types of courses, maybe in form of a glossary

Edit on 2024-05-14 by Patrick: The purpose of this issue has changed, please check here: https://github.com/acdh-oeaw/dhcr-main/issues/48#issuecomment-1724269045

IvdL22 commented 1 year ago

Thank you @vronk for creating this task. I noticed that we already have a draft document with definitions started by @PixlTracer: https://docs.google.com/document/d/1FMquEoMn7EIOHH6I00K_518cDIVzBsvShUnKCut0FGE/edit?usp=sharing. We will finalise the glossary in that document and notify you when done.

patrickakk commented 1 year ago

@IvdL22 @PixlTracer :

Would it be possible to provide 2 examples of a MOOC? With 2 links to more detailed information? So that we can use that as reference when working on this topic?

IvdL22 commented 1 year ago

@patrickakk cc @PixlTracer We now have a glossary here where you can check the terms and examples: https://docs.google.com/spreadsheets/d/1rXZQa-TsxTKXvyfZx4zeUxkC1_kqODO4/edit?usp=sharing&ouid=107443344257746111643&rtpof=true&sd=true

patrickakk commented 1 year ago

@IvdL22 @PixlTracer @vronk Summary from today's meeting:

It's no longer the intention to include MOOCs in the DHCR.

The open educational resources from Dariah Teach should be included. One example provided by Iulliana is: https://teach.dariah.eu/course/view.php?id=68&section=2

Important information fields are: title, description and link.

The next step would be for Patrick to analyse the structure of the new items and see if and how we can integrate this in the registry.

Is that a correct summary?

Note: since this is additional work, I'll move this task to the mai23 milestone.

IvdL22 commented 1 year ago

@patrickakk I agree with your summary and suggestion.

IvdL22 commented 1 year ago

@patrickakk @PixlTracer I have tested this by adding one course to the registry: https://dhcr.clarin-dariah.eu/courses/my-courses

You as admin can view it. It is not listed.

Conclusion: the metadata needs to be more flexible in order to be able to include open-source educational courses.

  • Institution - I have created DARIAH-TEACH
  • City - not applicable
  • Country - not applicable
  • ECTS - 1
  • Start date - anytime
  • Duration - it could be 1 week or 1 month, depending on the learner
  • Discipline - I could not select DH so I selected Other
  • Technique - I did not know which one to select. Could we add Other as an option?
  • Objects - DH

patrickakk commented 1 year ago

Moved to July milestone since the specs are not finished. Maybe we need to move it again, depending on how long this takes.

patrickakk commented 1 year ago

@PixlTracer @IvdL22 @vronk With this comment, I'll try to summarize the meeting from 2023-05-17 as well as propose some solutions:

Characteristics

Open educational resources could be included by teachers in other courses. They could also be used directly by students.

Question: We used various terms. Which one is correct? Is it "Open Education Resource" (OER)?

Currently they can be found in multiple locations:

- DARIAH [ca 16 courses]: https://teach.dariah.eu/course/index.php
- CLARIN [ca 10 courses]: https://www.clarin.eu/content/training-materials
- Upskills [ca 6 courses]: https://upskills.fil.bg.ac.rs/

Once they are added to the DHCR, the metadata of all of them will be available in one place, and the complete overview can be accessed through the API.

An example of recently updated (publicly shown) PhD courses through the API can be found at: https://dhcr.clarin-dariah.eu/api/v2/courses/index?recent&course_type_id=4 The same could be available once a new education type "Open Education Resource" is created.

It was proposed to enter the courses manually, since this is a small number of courses and the metadata is not expected to change often. The information could be checked/updated yearly by the administrators (Anna & Iulliana).

The characteristics of the OERs conflict with the current data model. Three options were discussed:

  1. Add one course which "summarizes" the OERs in one place (similar to the ACDH tool gallery). In this case it might be difficult for a user to find the details.
  2. Create a separate list/database/table/data model for the new characteristics. This is almost the same as creating a separate registry and requires a lot of development hours. Also, the items won't be available through the current filter/search options.
  3. Create a new Education Type and define a "set of rules" for how we deal with the differences in structure. This might be the preferred option, depending on how we can change/adapt the current validation rules.

Which characteristics are different? OERs have:

- No physically present institution and department
- No city and country and no location on the map (lat, lon)
- No start date and is not recurring
- No fixed duration [unit and type] (that would depend on the student)
- No lecturer name and email = no problem, since those fields are not required
- No entry requirements? = no problem, since this field is not required
- Are there Tadirah objects which apply?

The goal is that the data model represents the real world. Currently the validation rules apply to all courses, which means there are no exceptions based on education type. For example every course has to be at an existing institution and that has to be on a physically existing location. Another example is that every course needs to have at least one start date. The presence of these data is also important when users are using the filter or sort options.

We could try to implement different validation rules, which behave differently based on the Education Type. If we choose this solution, I'll have to look into it further to see if the model supports this.

What should the new validation rules do?

Institution and location difference

No start date available

No fixed duration [unit and type]

We could use the same approach as for the location difference: Create a new duration unit "Flexible" which is required for OER courses and can not be used by other courses. Would that be a good solution?
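
To make option 3 a bit more concrete, here is a minimal sketch of validation rules that depend on the education type. It is written in Python purely for illustration and is not the registry's actual validation code; the `Course` class, its field names and the error messages are all hypothetical. Only the education type "Open Education Resource" and the duration unit "Flexible" come from the proposal above.

```python
from dataclasses import dataclass
from typing import List, Optional

OER_TYPE = "Open Education Resource"  # proposed new education type
FLEXIBLE_UNIT = "Flexible"            # proposed duration unit, reserved for OERs


@dataclass
class Course:
    # Hypothetical field names, chosen for this sketch only.
    name: str
    education_type: str
    institution: Optional[str] = None
    start_date: Optional[str] = None
    duration_unit: Optional[str] = None


def validate_course(course: Course) -> List[str]:
    """Return a list of validation errors; an empty list means the course is valid."""
    errors: List[str] = []

    if course.education_type == OER_TYPE:
        # OERs skip the institution and start date rules,
        # but must use the dedicated duration unit.
        if course.duration_unit != FLEXIBLE_UNIT:
            errors.append("OER courses must use the 'Flexible' duration unit")
        return errors

    # Existing rules keep applying to all other education types.
    if not course.institution:
        errors.append("institution is required")
    if not course.start_date:
        errors.append("at least one start date is required")
    if not course.duration_unit:
        errors.append("duration unit is required")
    elif course.duration_unit == FLEXIBLE_UNIT:
        errors.append("'Flexible' duration unit is reserved for OER courses")
    return errors
```

The point of the sketch is that the exception lives in one place (the education type check), so the filter and sort options can still rely on the existing fields for every other course type.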

How can this new feature be communicated / be visible for users? How should OER courses be visible / findable for users?

  1. The user can filter on the new education type "Open Education Resource". (Use filter button)

  2. A section could be added in the menu, which contains an explanation of OERs as well as a button which automatically activates the filter and shows all the OERs. An example of clicking on this button, for the course type "PhD", would look like this: https://dhcr.clarin-dariah.eu/?course_type_id=4

  3. A short URL could be created, for easy dissemination, for example: https://dhcr.clarin-dariah.eu/open-education-resources Is this needed? Which short URL is preferred?

  4. Is anything else needed?

Is this summary correct? Did I miss anything? Are there suggestions/additions? And finally, what do you think of the proposed solution?

vronk commented 1 year ago

I have to admit, I am somewhat surprised about the turn this task took. As the analysis also shows, the educational resources are ontologically quite a different animal than courses: the former are some kind of digital object, the latter an activity.

Moreover, for a catalogue of resources we already have, within DARIAH alone, the DARIAH Campus: https://campus.dariah.eu/resources/page/1 besides many other catalogues and directories. In particular, dariahTeach materials are already included in DARIAH-Campus.

So I am decidedly against extending the functionality to support OER as first-class citizens next to courses. In my recollection, the original idea was to allow for links/pointers to resources (learning materials) pertaining to an existing course, not a separate index of learning materials.

PixlTracer commented 1 year ago

Matej and I had a chat about this and we propose to have a call with Iulianna and Toma to discuss and find a suitable solution.

patrickakk commented 1 year ago

Removed from milestone as discussed in meeting

vronk commented 11 months ago

As discussed today with Toma Tasovac (see agenda & minutes):

We want to keep a clear distinction between courses as Activities (event with start and end date and a location) and as Digital objects (like training materials available online).

And restrict DHCR to register and present the "Activities" and use DARIAH Campus for the "Digital objects".

That means that "online courses" available e.g. in dariahTeach or from the upskills project should not be registered in DHCR, but rather in D-C as external resources. (Indeed, courses from dariahTeach are already featured in D-C.)

However, it should be possible to record whether training material accompanying the corresponding activity is available. This should take the form of a field, ideally populated with a URL pointing to the online material (ideally hosted in a stable manner). Such material can be made available through DARIAH Campus, in which case the entry in DHCR would point to the corresponding entry in D-C.

patrickakk commented 11 months ago

@vronk Thanks for the explanation.

@vronk @PixlTracer @IvdL22 Can we close this issue? If yes, please change the label to "Done" or click on Close ;)

PixlTracer commented 11 months ago

Since we agreed to introduce a metadata field where the course owner/lecturer can leave a link that points to the available training/learning material (see @vronk's comment above), I'd like to leave this issue open until the metadata field has been implemented.

--> to do @patrickakk: introduce a (non-mandatory) metadata field in which a link can be set that points to learning material hosted/stored elsewhere

proposal: when adding a course via https://dhcr.clarin-dariah.eu/courses/add, I propose to insert the following metadata field:

dhcr new metadata field for course material

@IvdL22 please review the wording and indicate changes/give green light.

thank you!

patrickakk commented 11 months ago

@PixlTracer Thank you for pointing that out.

Can we specify what is needed:

(For now I moved it to the November milestone. After the specifications are clear, we could talk about the milestone for the implementation?)

PixlTracer commented 11 months ago

@PixlTracer Thank you for pointing that out.

Can we specify what is needed:

  • One new text field (as shown on the screenshot above)? Answer: yes
  • Are we sure there will always be one or no link? This means not more than 1? Does that apply in all cases? Answer: we will not know. we could offer a 2nd link field (which might not be used much)...
  • Should there be a simple check if it contains an url? (http) Answer: yes, good point! check for https://
  • Should there be a url validation check? Answer: absolutely!
  • Should that only be done on entry/update or also on regular basis? Answer: what do you mean with 'regular basis'? url validation check every now and then? -- would be useful, but how to handle invalid links then? remove? contact course owner? (question also for @IvdL22)
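
For reference, the "simple check" from the list above could look roughly like the following sketch. It is in Python for illustration only and is not taken from the DHCR code base; the function name `is_plausible_url` and the choice to accept both `http` and `https` by default are assumptions. It only checks the shape of the value and does not contact the server.

```python
from urllib.parse import urlparse


def is_plausible_url(value: str, allowed_schemes=("http", "https")) -> bool:
    """Cheap syntactic check: the value must have an allowed scheme and a host."""
    try:
        parts = urlparse(value.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed inputs (e.g. broken IPv6 hosts).
        return False
    return parts.scheme in allowed_schemes and bool(parts.netloc)
```

Passing `allowed_schemes=("https",)` would enforce the stricter "check for https://" variant; whether plain `http` links should be rejected is still an open decision.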

(For now I moved it to the November milestone. After the specifications are clear, we could talk about the milestone for the implementation?)

patrickakk commented 3 months ago

Preview of user interface with dummy data.

Publicly available info on the Course Detail page

image

New field in login area at Course add or edit

image

patrickakk commented 3 months ago

@IvdL22 cc @PixlTracer Is this the correct summary of what was said in the meeting on the 10th of June?

CLARIN needs the possibility to add links (plural) to datasets used in a course to the course metadata. DARIAH needs the possibility to add links (also plural?) to training material that is (re)used in a course to the course metadata.

On the other hand, based on the requirements here: https://github.com/acdh-oeaw/dhcr-main/issues/48#issuecomment-1755445882

Are we sure there will always be one or no link? This means not more than 1? Does that apply in all cases? we will not know. we could offer a 2nd link field (which might not be used much)...

and

Should there be a url validation check? absolutely!

The current (almost finished) implementation only contains one text field, with room for one link, and a link checker that requires a valid HTTP status code. So it's not possible to enter more than one link.

The information above almost certainly requires more than one item to be added, which would have to be implemented in the data model in a completely different way (a one-to-many relationship).

Based on the new information above the current implementation does not provide what is needed.

During the meeting it was decided to discuss what is needed during the WG meeting on the 19th June.

I'll revert and not commit the work already done, put the issue on hold and move it to the July milestone. Maybe we can all agree on what is needed soon?

IvdL22 commented 2 months ago

Hi @patrickakk cc @PixlTracer Thank you. The way I see this implementation is the following:

Description: If you use a CLARIN or DARIAH resource in your course, be it training material, dataset, tool or service, please add the URL.

Optional field 1: CLARIN resource

Optional field 2: DARIAH resource

What do you think? Would this be possible to implement? I could add the text like this to the slide and then discuss with the WG.

patrickakk commented 2 months ago

@IvdL22 @PixlTracer

I would suggest going a (few) steps back in the process: Why do we develop this feature?

Is this because CLARIN and DARIAH want to know which resources are used and in which courses or how often they are used? In that case you want to measure something, which makes it an important feature? And in that case, why do you want to ask during a WG meeting what they need?

What do we do when there is more than one link to add?

In the case of a CLARIN dataset, do you expect they always only used one? What do you do when they used more than one and can't enter the information? Then your report is unreliable?

Do we need to add attributes to the link? (Source: DARIAH/CLARIN, Type: Dataset, tool, training material, etc.)?

How do you want to see/summarize the data which is entered?

With the requirements/wishes currently available, I would suggest the following steps:

a) Iulliana and Anna provide at least three (3) real-world examples of courses and the links that can be added

b) We create a dummy preview of a report for both organizations and check if that contains the information that's needed, in the format/with the attributes needed.

c) Based on point B, we decide which information, attributes and data structure is needed.

d) Dummy user interface preview. Now, this needs to be accepted before proceeding to the next step.

e) Decide if the new fields should be provided by the API as well.

f) Technical implementation

This is probably not a small feature. Given the larger number of working hours needed for this, I would suggest having at least 1 dedicated meeting about it, at the end of which everybody commits to a set of requirements, to avoid doing the same work over and over again. And at least involve Matej @vronk as well.

And please think about this: It's easy to make changes to the wishes now, more complicated when the feature is finished and way more complicated when some data has been entered. So what do you need from this feature in 1 year from now? A small misunderstanding now, can cause a lot of additional working hours later.

vronk commented 2 months ago

The idea is that there can be more than 1 URL reference for datasets/training material (or other resources), no distinction between CLARIN and DARIAH

a separate 1:N table: external_resources

[<“label”, “URL”, “type”, “affiliation”>, ..]

“label” is an open text describing the resource (optional)
“type” = Dataset, Training Material, Service, Software, … (optional)
“affiliation” = CLARIN, DARIAH, … (optional)

{label}/{URL}

{affiliation}{type}: {label}
{URL}

CLARIN Dataset: CLARIAH.NL corpus of child speech https://clariah.nl/…

Link checking upon submission (optional, if it takes little effort): accept HTTP status >= 200 and < 400.

The new fields should be available via the API too.

vronk commented 2 months ago

@IvdL22, @PixlTracer could you please provide at least three real world examples of such external resources according to the proposed data structure

IvdL22 commented 2 months ago

@vronk @patrickakk @PixlTracer Matej, thanks for the technical solution.

Example: This is a course in the registry: Puheen analyysin perusteet (Introduction to Speech Analysis)

“label” = Introduction to Speech Analysis
“type” = Training Material
“affiliation” = CLARIN
{label}/{URL} https://www.clarin.eu/content/introduction-speech-analysis

“label” = Route to a Wing Corpus
“type” = Dataset
“affiliation” = FIN-CLARIN
{label}/{URL} http://urn.fi/urn:nbn:fi:lb-2020112929

The same course also uses the CLARIN VLO to show students how to search for other corpora, so I could also add the service.

Is this example enough?

Edited by patrickakk on 2024-07-03. Reason: Fixed links

patrickakk commented 2 months ago

@IvdL22 @PixlTracer cc @vronk Thank you for providing one example. Would it be possible to provide the other two examples as well?

Could you include at least one example which uses DARIAH resources? And in general: include as many exceptions and complicated situations as possible?

@IvdL22 Could you also specify how the VLO should be added to the example you provided?

IvdL22 commented 1 week ago

@patrickakk cc @vronk @PixlTracer Regarding the VLO, if a teacher is using a corpus found via the VLO, the labels will be the same:

“label” = Nijmegen corpora of casual speech
“type” = Dataset
“affiliation” = CLARIAH-NL
{label}/{URL} https://hdl.handle.net/1839/2581a242-fde3-4349-ab06-4920a964803d

Regarding the DARIAH Campus: if a teacher is using a learning or training resource in the DH programme entered in the registry, the teacher should be able to add the link to the resource:

“label” = Formal Ontologies: A Complete Novice's Guide
“type” = Training Material
“affiliation” = DARIAH CAMPUS
{label}/{URL} https://campus.dariah.eu/resource/posts/formal-ontologies-a-complete-novices-guide

Please let me know if some things are not clear. Thank you for your help.

patrickakk commented 1 week ago

@IvdL22 cc @PixlTracer @vronk

Thank you for the examples. Do you agree, when considering the specifications here: https://github.com/acdh-oeaw/dhcr-main/issues/48#issuecomment-2167415842 that the affiliation in the last two examples should be "CLARIN" and "DARIAH"?

Since, according to:

“affiliation” = CLARIN, DARIAH, … (optional)

There are only two options for this value. The idea behind this was that a resource belongs either to CLARIN or to DARIAH (or there is an exception).

Is there a special reason for specifying this in more detail?

This is an important difference, please let me know if you think otherwise or have any questions.

IvdL22 commented 1 week ago

@patrickakk Both CLARIN and DARIAH have national consortia and repositories, so the affiliation can also be CLARIN-FIN, etc. As @vronk suggested, there could be more values, not only two. The course contributors should be free to fill in the institution hosting the resource they use in teaching.

patrickakk commented 1 week ago

@IvdL22 cc @PixlTracer

Thank you for the explanation and valuable insights. I was aware of the national consortia, but until now it wasn't clear to me that you wanted to specify things at that level of detail. Maybe situations like this are also a good example for everybody of why it is useful to ask for examples and ask such questions at this (early) stage of the development process?

  1. Type of values: You also mentioned "course contributors should be free to fill...". To avoid any misunderstanding about that, I will start by describing the kind of input that was meant; as a second step we could discuss the values.

[<“label”, “URL”, “type”, “affiliation”>, ..]
“label” is an open text describing the resource (optional)
“type” = Dataset, Training Material, Service, Software, … (optional)
“affiliation” = CLARIN, DARIAH, … (optional)

Does that clarify a bit?

  2. Kind of affiliation: As far as I understood, the idea was to let them choose from either CLARIN, DARIAH, or none. Now you mentioned it should be possible to specify the national consortia. If we change this, how would it be possible to answer, for example, the following questions using the dataset:

- Which courses use a DARIAH resource?
- Which courses are in the German language and use a CLARIN resource?

A possible solution could be to define an affiliation parent type, with DARIAH or CLARIN as values, and to define the national consortia as affiliation child types. Are we sure that every national consortium belongs to only one parent type? And of course this makes the implementation more complex, which means more working hours are needed.
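
A minimal sketch of the parent/child idea, in Python for illustration only. The mapping below is an assumption made up for this example, not a verified list of consortia, and the names `AFFILIATION_PARENTS`, `parents_of` and `courses_using` are hypothetical. CLARIAH-NL is mapped to both parents here purely to show how a child type with more than one parent could be handled.

```python
from typing import Iterable, List, Optional, Set, Tuple

# Illustrative mapping from the affiliation entered by a contributor
# to its parent type(s). The values are assumptions for this sketch.
AFFILIATION_PARENTS = {
    "CLARIN": {"CLARIN"},
    "DARIAH": {"DARIAH"},
    "FIN-CLARIN": {"CLARIN"},
    "DARIAH CAMPUS": {"DARIAH"},
    "CLARIAH-NL": {"CLARIN", "DARIAH"},  # example of a child with two parents
}


def parents_of(affiliation: Optional[str]) -> Set[str]:
    """Return the parent types of an affiliation; unknown or empty values have none."""
    return AFFILIATION_PARENTS.get(affiliation, set())


def courses_using(parent: str, resources: Iterable[Tuple[str, Optional[str]]]) -> List[str]:
    """List the courses that have at least one resource under the given parent type.

    `resources` is an iterable of (course name, affiliation) pairs.
    """
    return sorted({course for course, aff in resources if parent in parents_of(aff)})
```

With such a mapping, a question like "Which courses use a DARIAH resource?" can still be answered even when contributors enter a national consortium, at the cost of maintaining the mapping.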

On the other hand, the example provided by vronk shows something different:

CLARIN Dataset: CLARIAH.NL corpus of child speech https://clariah.nl/…

affiliation = clarin
type = dataset
label = CLARIAH.NL corpus of child speech

Are we still all on the same page? Is this understandable or should we discuss this in a (short) meeting?