NicoledeGreef closed this issue 1 year ago
I agree with the approach you've described.
@NicoledeGreef I'm working on this right now. Is a data set always ARCS or ORCS, or can it be ARCS and ORCS?
I have consulted with our Infor Mgmt SME and...
There may be overlap between applying ARCS and ORCS to a dataset though this is expected to be infrequent. Can we plan for the possibility that both may apply?
@NicoledeGreef I would like your thoughts on this approach. It's a bit tricky to build, but we can do it. Can you share internally for input and get back to me please?
I met with SMEs from Info Mgmt and Data Curation. We reviewed the Figma link in the previous comment.
Here is some feedback. We agreed that this is an advanced topic for many users and therefore cannot be expected to be included when each metadata record is drafted by a business user (so we must not make this a mandatory element in order to save the record).
For the foreseeable future, we agreed to omit "Active Period" and "Semi-Active Period" from the user inputs, reducing some complexity.
It is not imperative that the values we do track are searchable (facets) within the application; Product Owner persona and Info Mgmt super user SMEs could have access to a report and that would be sufficient. It is presumed that most business users won't be aware of Info Schedule values without having a conversation with an Info Management specialist.
Ideally we would make use of pick lists based on taxonomy values in order to reduce user input inconsistencies.
Most of the items in the Finance Data Catalogue are anticipated to be ORCS but
Top level choice should be:
Information Schedule Type one of: ARCS | ORCS | Special | Unscheduled
Using ARCS as an example:
Information Schedule Name would be: Administrative Records Classification System
Schedule Number would be: 100001, which is unique to ARCS (ORCS and Special each have many schedule numbers; Unscheduled has none)
Based on the Information Schedule choice a user makes, a list of Information Schedule Primary Title topics would be available; if we use ARCS as an example Information Schedule Name, a subset of Information Schedule Primary Number/Title values would be as listed here:
6000 - Information Technology, General
6450 - Information System Development & Changes
6820 - Information Systems Operations
6840 - Change Management
6880 - Telecommunication Network Management
6890 - Radio Communication
If we use 6840 - Change Management as an example Information Schedule Primary, the Information Schedule Secondary Title values would be as listed here:
-00 Policy and procedures
-10 General
-20 IT change management records
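The cascading pick list described above can be sketched as a nested mapping. This is only a rough Python illustration, not how the catalogue implements it: `PICK_LIST`, `primaries_for` and `secondaries_for` are hypothetical names, and only the ARCS values quoted in this comment are filled in.

```python
# Hypothetical nested pick list: schedule type -> primary -> secondaries.
# Only values quoted in this thread are included; an empty list means the
# secondaries were not listed here, not that none exist.
PICK_LIST = {
    "ARCS": {
        "6000 - Information Technology, General": [],
        "6450 - Information System Development & Changes": [],
        "6820 - Information Systems Operations": [],
        "6840 - Change Management": [
            "-00 Policy and procedures",
            "-10 General",
            "-20 IT change management records",
        ],
        "6880 - Telecommunication Network Management": [],
        "6890 - Radio Communication": [],
    },
    "ORCS": {},
    "Special": {},
    "Unscheduled": {},  # no further choices for Unscheduled
}


def primaries_for(schedule_type: str) -> list:
    """Return the primary titles offered once a schedule type is chosen."""
    return list(PICK_LIST[schedule_type])


def secondaries_for(schedule_type: str, primary: str) -> list:
    """Return the secondary titles offered once a primary is chosen."""
    return list(PICK_LIST[schedule_type][primary])
```

Each selection narrows the next list, so a user who picks ARCS and then 6840 - Change Management only ever sees the three secondaries above.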
ARCS don't change very often if at all. ARCS are listed here.
An example of an ARCS schedule code value once a user has made selections: 100001-440-20. The codes translate to: "Administration-440 - Reporting & Statistical Analysis-20 - Reports and statistics (not covered elsewhere)" but for display purposes the Secondary Title label is sufficient, Reporting & Statistical Analysis
An ORCS Schedule exists only after a business area has had a conversation with the Records Management SMEs within Government, and the ORCS is then published to the web in the ORCS Library.
There may be more ORCS that apply to the work logged in the Finance Data Catalogue than any other type (ARCS, Special or Unscheduled).
You can search the ORCS Library; a search for "orcs" returned all records. The facets-like left hand bar in the ORCS Library lets you refine by Ministry and this brings it down to sixteen results. There may be ORCS from other Ministries that apply to the work being done within Ministry of Finance; we should be mindful of that but as a start Information Schedule Name could be derived from one of those 16 hits; for example:
- Banking and Cash Management ORCS
- Business Risk Management Programs ORCS
- Community Initiatives & Olympic Bid ORCS
- Consumer Taxation ORCS
- Crown Agency Services ORCS
- Federal-provincial Relations & Research ORCS
- Gaming ORCS
- Income Taxation ORCS
- Mineral, Oil & Gas Revenue ORCS
- Officer of the Comptroller General: 2007 Edition ORCS
- Office of the Comptroller General ORCS
- Property Taxation ORCS
- Revenue & Student Loan Contract Management ORCS
- Revenue Services British Columbia ORCS
- Risk Management ORCS
- Taxation Revenue Appeals ORCS
If a user chooses ORCS, there should be a pick list where they can choose the Schedule Number. We will have to unearth these schedule numbers from the PDFs.
"Property Taxation ORCS" is Schedule Number 160184, for example (see Property Taxation ORCS)
There are Primary and Secondary Schedule Title numbers for an ORCS. These are found within the PDF documents in the ORCS Library (look in Section 1).
The ORCS and ARCS unique identifier is based on: Schedule Number-Primary Title Number-Secondary Title Number
An example of an ORCS schedule code value once a user has made selections: 160184-45000-03. The codes translate to: "160184 - Property Taxation ORCS-45000 - PROPERTY TAXATION - GENERAL -03- Property taxation data warehouse data" but for display purposes the Secondary Title label is sufficient, Property taxation data warehouse data
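To make the identifier format concrete, here is a small Python sketch that splits and rebuilds a Schedule Number-Primary-Secondary code. `ScheduleCode` and `parse_schedule_code` are hypothetical names used for illustration only. The parts are kept as strings so a secondary such as 03 keeps its leading zero. The sketch covers the three-part ARCS and ORCS form only, not Special codes.

```python
from typing import NamedTuple


class ScheduleCode(NamedTuple):
    """Hypothetical container for an ARCS/ORCS schedule code."""

    schedule_number: str
    primary: str
    secondary: str

    def __str__(self) -> str:
        # Rebuild the display form, e.g. "160184-45000-03".
        return "-".join(self)


def parse_schedule_code(code: str) -> ScheduleCode:
    """Split 'ScheduleNumber-Primary-Secondary' into its three parts."""
    parts = code.split("-")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected three numeric parts, got {code!r}")
    return ScheduleCode(*parts)


orcs = parse_schedule_code("160184-45000-03")
arcs = parse_schedule_code("100001-440-20")
```

Here `orcs.secondary` is the string `"03"` and `str(arcs)` gives back `"100001-440-20"`.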
Special is a category outside of ARCS and ORCS (see this page)
The list is relatively brief for Special:
- Commission of Inquiry Records (schedule 112907)
- Computer System Electronic Backup Records (schedule 112910) has been superseded by ARCS secondary 6820-05
- Executive Records (schedule 102906)
- General Records (schedule 112909)
- Government House Records (schedule 112911)
- Records of the British Columbia Commission of Inquiry into Missing and Murdered Indigenous Women and Girls (schedule 170439)
- Lieutenant-Governor Records (schedule 112912)
- Record Copies of Published Maps (schedule 112908)
- Records of Defunct Programs (schedule 158691)
- Redundant Source Information (schedule 206175)
- Special Media Records (schedule 102905)
- Transitory Information (schedule 102901)
- Year 2000 (Y2K) Project Documentation and Test Data (schedule 112916)
Most likely to be applied are in bold text.
Unscheduled: a Schedule has not yet been created for the subject metadata item. This means that records management work is outstanding and, until that work is complete, the data records cannot be disposed of. There may be a few areas in RMO that are currently Unscheduled.
No further info needs to be supplied.
For the ARCS | ORCS | Special cases, the following are good to track:
Records life cycle
A | Active
SA | Semi-active
FD | Final Disposition
Final disposition categories
DE | Destruction
FR | Full Retention
SR | Selective Retention
Special flags
FOI | Freedom of Information/Protection of Privacy
PIB | Personal Information Bank
VR | Vital Records
@NicoledeGreef If I'm understanding this correctly, it's a lot cleaner than what I came up with. I think we can do it with a taxonomy and an entity reference view.
Is it OK if we separate the life cycle? That's more difficult to put into a taxonomy, unless a life cycle is always tied to the info schedule value. For example, if 160184-45000-03 always has the same life cycle and final disposition, then it's a field we set on the term and the user never has to think about it. It looks like that's the case in the PDF, but I'm not sure.
Question for follow-up: Are the Special Flags always associated with a Secondary value or is it dataset specific?
By separating the life cycle, do you mean separating the A, SA and FD phases? Yes please, as I will want to query/report on these individually (e.g. show me all the datasets where the FD = FR).
Each primary/secondary will have its own A/SA/FD values and this won't change over time. For example, ARCS 6450-80 has SO 2y SR. The only wrinkle is the OPR/Non-OPR qualifier (OPR - Office of Primary Responsibility). If a business area is OPR they have one set of life cycle values, and if they are Non-OPR they have another. But I think we have OPR at the dataset level, so this is part of why choosing an Information Schedule is a bit of a conversation, at least for now until we bring up our knowledge in this space.
Question for follow-up: Are the Special Flags always associated with a Secondary value or is it dataset specific?
Yes, special flags are for specific secondaries. There are some examples in the Property Taxation ORCS (e.g. 45800-05: Property transfer tax data and images has a flag of PIB).
Interestingly, I also found a special flag in that schedule that was not on our list: PUR: The Taxation (Rural Area) Act (s. 22) requires that a copy of the tax roll be made available for public review.
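To illustrate why keeping A, SA and FD as separate values helps with reporting, here is a rough Python sketch. `Retention`, `ScheduleTerm` and `with_final_disposition` are hypothetical names, the two records are built only from values quoted in this thread, and the OPR/Non-OPR distinction is left out for simplicity. The retention for 45800-05 is not given in the thread, so it is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Retention:
    """Life cycle values stored separately so each can be queried."""

    active: str             # e.g. "SO"
    semi_active: str        # e.g. "2y"
    final_disposition: str  # "DE", "FR" or "SR"


@dataclass(frozen=True)
class ScheduleTerm:
    code: str
    retention: Optional[Retention] = None  # None: not stated in this thread
    special_flags: Tuple[str, ...] = ()    # flags belong to the secondary


TERMS = [
    # ARCS 6450-80 has SO 2y SR (from the comment above).
    ScheduleTerm("100001-6450-80", Retention("SO", "2y", "SR")),
    # Property Taxation ORCS 45800-05 carries the PIB flag.
    ScheduleTerm("160184-45800-05", special_flags=("PIB",)),
]


def with_final_disposition(terms: List[ScheduleTerm], fd: str) -> List[ScheduleTerm]:
    """Report-style query: every term whose final disposition equals fd."""
    return [
        term for term in terms
        if term.retention is not None and term.retention.final_disposition == fd
    ]
```

A query such as `with_final_disposition(TERMS, "FR")` is the "show me everything where FD = FR" report. With only these two records it returns an empty list, while `"SR"` returns the ARCS 6450-80 term.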
@lkmorlan I have taken this as far as I can. Here is my update.
Some of the work on this is done, see the 128-data-set-life-cycle-information-schedule branch
The idea here is to tie everything to do with info schedule to a taxonomy. That way, users only need to select the correct information schedule and everything else populates.
Most of this is set up. The taxonomy for the information schedule exists: information_schedule. It has all the fields required to populate everything in the info schedule. Some of these fields are entity references to other taxonomy terms:
- field_active_period references taxonomy record_life_cycle_duration
- field_semi_active_period references taxonomy record_life_cycle_duration
- field_final_disposition references record_disposition
- field_special_flags references record_special_flags
The duration, disposition and special flags taxonomies all use a field_abbr_full_name field. This is because abbreviations are used in the official info schedule specifications, but it's nice to provide a human-readable name.
Simple hierarchical select is used in the form display, see /admin/structure/types/manage/data_set/form-display/data_set_description.
The issue is that on the build page, there is an extra field showing the TID. This should not be visible to the user.
See #1 on the screenshot below
I'm using a feature called Flexible Hierarchy in Client-side Hierarchical Select to display the first and last item in a taxonomy. This works, though if there is a better way, go for it.
NOTE: I added the schedule type, client may not want this
See #2 on the screenshot below
This has to be a field so it can be used in reports, views, etc. We can't do it with twig.
The info schedule code needs work. Right now I'm using the field_token_value module to show the code, but it has some issues.
When an admin creates a term in information_schedule, one field is field_schedule_number. This corresponds to the numeric code used in the info schedule spec. Depending on which schedule is used, the root term may or may not use a numeric code.
ARCS will only ever have 3 levels, including the root term.
ARCS uses a root numeric code, so the info schedule code should render, for example, as 100001-440-20, where:
ORCS will have 4 levels, including the root term.
ORCS does not use a numeric code for the root term (ORCS). The code for ORCS should display, for example, as 160184-45000-03, where:
Special has two levels, including the root term.
Special does not use a numeric code for the root term (Special).
The code for Special should display, for example, as 112907, where:
NOTES
- field_token_value may not be the right approach. A custom token might be better.
- We can leverage a help guide. For example, if we tell people to leave field_schedule_number empty for ORCS and Special, we can generate the info schedule code by following the chain of numbers.
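The "follow the chain of numbers" idea can be sketched as below. This is plain Python for illustration, not Drupal code and not the custom token itself: `Term` and `schedule_code` are hypothetical names, and the empty-string convention for roots without a numeric code is the assumption proposed in the note above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    """Hypothetical stand-in for an information_schedule taxonomy term."""

    name: str
    schedule_number: str = ""        # left empty for roots with no numeric code
    parent: Optional["Term"] = None  # None for the root term


def schedule_code(term: Term) -> str:
    """Walk from the selected term up to the root, keeping non-empty numbers."""
    parts = []
    node: Optional[Term] = term
    while node is not None:
        if node.schedule_number:
            parts.append(node.schedule_number)
        node = node.parent
    # Numbers were collected leaf-first, so reverse for display order.
    return "-".join(reversed(parts))


# ARCS: 3 levels, and the root carries a numeric code.
arcs_root = Term("ARCS", "100001")
arcs_leaf = Term("20", "20", Term("440", "440", arcs_root))

# ORCS: 4 levels, and the root has no numeric code.
orcs_root = Term("ORCS")
orcs_schedule = Term("Property Taxation ORCS", "160184", orcs_root)
orcs_leaf = Term("03", "03", Term("45000", "45000", orcs_schedule))

# Special: 2 levels, and the root has no numeric code.
special_leaf = Term("Commission of Inquiry Records", "112907", Term("Special"))
```

With these terms, `schedule_code` yields `100001-440-20`, `160184-45000-03` and `112907` respectively, which matches the three display formats described above without any per-schedule special casing.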
The friendly name and the numeric code should be visually related, under the same label, Information schedule, but they need to be separate fields. See #1 on the screenshot
In the details section, see #3 in the screenshot, the field_token_value module is used. This is fine, except values don't update unless the node is saved. That's not a huge deal as the info schedule won't change much. It would be nice, though, if an update to the term also showed up here. It's not something to spend a lot of time on, but if we have something quick that is more dynamic, it would be a good idea.
For active, semi-active and final disposition, we should have Not applicable if there is no value. This serves the purpose of letting the user know a value was not missed in error. We can accomplish this by setting 'NA' as the default value and requiring the field.
For Special Flags, this is rarely used. Here, the field and label should not be visible unless there is a value.
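The two display rules above can be summarised in a short Python sketch. It only illustrates the intended behaviour and is not the Twig or field configuration; `display_period` and `render_details` are hypothetical names, and the row labels are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def display_period(value: Optional[str]) -> str:
    """Show 'Not applicable' so an empty value is not mistaken for an omission."""
    return value if value else "Not applicable"


def render_details(
    active: Optional[str],
    semi_active: Optional[str],
    final_disposition: Optional[str],
    special_flags: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """Build (label, value) rows for the details section."""
    rows = [
        ("Active period", display_period(active)),
        ("Semi-active period", display_period(semi_active)),
        ("Final disposition", display_period(final_disposition)),
    ]
    # Special flags are rare: hide both label and field when there is no value.
    if special_flags:
        rows.append(("Special flags", ", ".join(special_flags)))
    return rows
```

So the three life cycle rows always appear, while the Special flags row only appears when at least one flag is set.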
Screenshot
Further changes to what I have done.
Final disposition and special flags don't need to be fields here. They can be done in twig.
This works, assigning to myself to complete documentation
assigning to Liam.
There is an issue with generating the code
When I do the following, it fails
see the problem. It loads the parents using the term ID, which doesn’t exist until it is saved. @lkmorlan via slack
@NicoledeGreef this is ready for your review.
Please follow the documentation here https://mfin-data-catalogue.apps.silver.devops.gov.bc.ca/documentation/information-schedule
That way, you can test the docs and the feature at the same time.
@NicoledeGreef - I am interested in seeing how this works in action - might be worth setting up a time to chat/demo? We might want to drop some pieces for now (retention, flags, etc) and focus on the core information first. There are more nuances as more complexity is added and even just having high level schedule info is an improvement over current state. There has been recent feedback regarding rolling out new IM things in bite-sized pieces to make it less overwhelming given our maturity in this space.
OP timer
https://openplus.monday.com/boards/4092908516/pulses/5007970088
Discussed in https://github.com/bcgov/MFIN-Data-Catalogue/discussions/113