BiologicalRecordsCentre / iRecord

Repository to store and track enhancements, issues and tasks regarding the iRecord website.
http://irecord.org.uk

Recording form to support dynamic occurrence attributes #222

Closed DavidRoy closed 5 years ago

DavidRoy commented 7 years ago

There may be an issue to cover this already, but I can't find it (although it is related to #13).
Also linked to same requirement for the iRecord App https://github.com/NERC-CEH/irecord-app/issues/14

There is a requirement for a multi-taxa species list recording form for iRecord to capture occurrence attribute values dynamically, e.g.

The requirements are:

Could do with some discussion as to the best way to implement this

kazlauskis commented 7 years ago

As a guess, it might be a good idea to assign each taxa group to different surveys altogether and then create a form that dynamically builds itself depending on the survey's attributes. In general, such a form would be handy as all I would then need to do is just specify the survey ID and the form would build itself.

JimBacon commented 7 years ago

Karolis suggests multiple surveys with fixed attributes as opposed to one survey having multiple attributes that are dynamically selected. An advantage of this is that it is what we already have configured. A consequence (disadvantage ?) is that it leads to a proliferation of surveys.

To pursue this option would need a look up from species on a website to survey. Since it is the warehouse that knows the species list I envisage something like a taxon_meanings_websites table. It would have columns taxon_meaning_id and website_id to perform the look up and return the survey_id to use.

We could also add a url column to the table to allow redirection to an existing form rather than constructing a dynamically built form which is likely to be less well organised.
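
The look-up described above could be sketched roughly as follows. This is only an illustration, assuming an in-memory stand-in for the proposed taxon_meanings_websites table; the row values, the URL and the `findSurvey` helper are hypothetical, and only the column names come from the comment.

```javascript
// Hypothetical rows standing in for the proposed taxon_meanings_websites table.
// All IDs and the URL are invented for illustration.
const taxonMeaningsWebsites = [
  { taxon_meaning_id: 101, website_id: 1, survey_id: 10, url: null },
  { taxon_meaning_id: 202, website_id: 1, survey_id: 20, url: '/enter-beetle-record' },
];

// Return the survey (and optional form URL) for a species on a website,
// or null when no row matches.
function findSurvey(rows, taxonMeaningId, websiteId) {
  const row = rows.find(
    (r) => r.taxon_meaning_id === taxonMeaningId && r.website_id === websiteId
  );
  if (!row) return null;
  // A non-null url would mean "redirect to an existing form" rather than
  // building a form dynamically from the survey's attributes.
  return { surveyId: row.survey_id, redirectUrl: row.url };
}
```

A caller would first check `redirectUrl` and only fall back to a dynamically built form when it is null.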

It would be great to be able to use taxon_groups rather than taxon_meanings as the table would be so much smaller and easier to construct. Could all beetle schemes agree a single set of attributes though?!

Question to David - is the requirement really per scheme rather than per taxonomic group?

DavidRoy commented 7 years ago

It is definitely recording schemes, which clearly have a relationship to taxonomic groups (and taxon_groups within UKSI).

We are currently linking species on the UKSI to recording_schemes, and are in discussion with Chris Raper about this becoming part of the UKSI. I'm not sure how this might be accommodated within Indicia but you can work on the basis that this will exist.

kazlauskis commented 7 years ago

As an alternative thought @japonicus has mentioned that it might be easier to have a giant survey that would present itself through a form where attributes are simply enabled or disabled depending on the taxa groups.

One way or another, my guess is that the form would have to request the taxon-specific form model (maybe something like this) on every taxon update, and that it will become very dynamic. Something like Angular or React could help here as these two must have lots of tools for such dynamic forms.

Here is an example of how Tom has done his dynamic recording form using description of occurrence model. Note: click on the green bits - it will open up different parts of the form.

kitenetter commented 7 years ago

Do we need to consider the split between location, sample and occurrence attributes? I can imagine that some attributes (date, location) will likely be needed for all forms, but others (including some sample attributes) might vary, e.g. method, or source of record or specimen (for those schemes that commonly deal with records from specimen collections). And I imagine that having variable sample attributes might be more problematic than having variable occurrence attributes?

I would also expect a requirement that even if several schemes wanted the same attribute, they might want to populate that attribute from differing termlists (again method is an obvious case).

We do have some consensus on a single data structure for all beetle recording schemes, but the consensus has only been sought from a subset of scheme organisers so far!

We could also usefully revisit previous discussions (e.g. #195 ) about how we can enable terms to be made active for data entry or not, in order to futureproof ourselves for the inevitable changes in approach over time.

DavidRoy commented 7 years ago

I think we should focus initially on occurrence attributes but keep sample (and location) attributes in scope. I agree about the need to consider #195

JimBacon commented 7 years ago

Responding to David:

Okay, if it is definitely recording schemes then we definitely can't use the taxon_groups.

Let us continue to consider the case of a survey per scheme, as Karolis first suggested. If we assume that, in future, there will be scheme/taxon data in the warehouse then what we will need is a scheme/survey look up. That could be a custom attribute on the survey.

The user selects a species from which we know the scheme. The warehouse is queried for surveys belonging to our website with the matching scheme attribute. Knowing the survey we then build a generic form dynamically. Alternatively/additionally, there could be a custom survey attribute containing a path to an existing recording form specifically designed for the scheme.

(Note that, for building the scheme/taxon data, the Aquatic Coleoptera scheme is a 'beetle in the ointment' as it covers species also in other schemes.)

DavidRoy commented 7 years ago

"Re: Alternatively/additionally, there could be a custom survey attribute containing a path to an existing recording form specifically designed for the scheme" Note that a potential requirement is for a data grid to allow species to be entered across recording schemes, with the attributes/termlists being made available dynamically. I realise this is potentially complex, so one option is to have a relatively fixed set of attributes (e.g. stage, status, sex) but the terms vary depending on the species selected.

johnvanbreda commented 7 years ago

Coming in late to the discussion, a few thoughts of my own:

  1. Having a single record form that dynamically loads attributes from an appropriate survey definition might work. I don't see a multi-record grid working though. For example I might want to enter a plant record plus the bees that visit it and I would want these to belong to the same survey dataset, not 2 different ones. So I think the idea of a super-survey with a range of attributes picked from when you choose the species is more flexible in the long run.
  2. Although taxon groups might not be an appropriate way to divide the attributes up, what about using the family (which is included handily in various cache tables so convenient and fast)?
  3. Part of this requirement might be met by having a form where the attributes themselves remain consistent, but the termlists used to populate the lookup are switched according to the species chosen. So, you would have an abundance attribute which uses a different termlist depending on the species chosen. I'm not sure if this helps if dynamic attributes really are required but it might be easier to implement if the requirement is really dynamic drop down terms.

If the requirement is dynamic attributes then I could imagine an implementation being along the lines of:

  1. Populate the groups table with a list of recording schemes (setting group type term appropriately).
  2. Either use the group filter definition to list the appropriate families, or have a new families_groups table with a UI to populate it.
  3. A new table occurrence_attributes_groups to join between groups (recording schemes) and their relevant attributes (plus one for sample attributes if we get that far) and a UI to configure it.
  4. A big survey with lots of occurrence attributes. Dynamic ones are linked to their groups (recording schemes) and therefore to the families.
  5. A configuration to enable dynamic attributes in the species checklist grid.
  6. The species checklist then performs an attribute lookup when you pick a species, based on the family. Attributes with a shared system function could all load into the same column (e.g. abundance) whereas other more specific attributes would presumably need to go in the extra attributes row.
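
The family-based lookup in step 6 might look something like the sketch below. It assumes in-memory stand-ins for the proposed families_groups and occurrence_attributes_groups tables; the row contents, IDs, captions and the `attributesForFamily` helper are all hypothetical.

```javascript
// Hypothetical rows; the table names follow the proposal, the contents are invented.
const familiesGroups = [
  { family: 'Carabidae', group_id: 1 }, // group 1: a beetle recording scheme
  { family: 'Aeshnidae', group_id: 2 }, // group 2: a dragonfly recording scheme
];
const occurrenceAttributesGroups = [
  { group_id: 1, occurrence_attribute_id: 501, caption: 'Beetle abundance' },
  { group_id: 1, occurrence_attribute_id: 502, caption: 'Trap type' },
  { group_id: 2, occurrence_attribute_id: 601, caption: 'Dragonfly abundance' },
];

// Find the recording scheme groups for a family, then the attributes
// linked to those groups.
function attributesForFamily(family) {
  const groupIds = familiesGroups
    .filter((fg) => fg.family === family)
    .map((fg) => fg.group_id);
  return occurrenceAttributesGroups
    .filter((link) => groupIds.includes(link.group_id))
    .map((link) => link.caption);
}
```

A family with no linked scheme simply yields an empty list, so the checklist row would show only the survey's fixed attributes.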

DavidRoy commented 7 years ago

That's a good summary. Having reflected on discussion, I think we should do the following in the first instance.

  1. Work at the family level since the relevant attributes will be fixed at this level of the hierarchy. Some families are split across recording schemes but this is rarely the case, and even when they are the attribute-terms can be the same.
  2. Have a 'super-survey' with a fixed set of occurrence and sample attributes. Could consider extending this with columns for the 'extra attributes' row
  3. Termlist changes dynamically as a species is selected
kitenetter commented 7 years ago

One question on the 'super-survey' approach: will this make it more difficult to download the relevant attributes for each taxon group? We have two routes to download data:

I think the first of those two points is the more important, as I suspect most users (including verifiers and LERCs) use the generic download most of the time.

kazlauskis commented 7 years ago

I think that such functionality that downloads all the attributes should be the default one. If I click to download all my records I would like to have all the data associated with each record without having to specify the survey (which I might not even know).

If there are empty attribute columns then simply don't include those in the download report.
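
As a rough sketch of that idea (not how the download reports actually work), empty attribute columns could be filtered out after the rows are fetched. The record shape and the `dropEmptyColumns` helper are assumptions made for illustration.

```javascript
// Remove columns that are empty (null, undefined or '') in every record.
function dropEmptyColumns(records) {
  const isEmpty = (v) => v === null || v === undefined || v === '';
  const keep = new Set();
  for (const rec of records) {
    for (const [col, value] of Object.entries(rec)) {
      if (!isEmpty(value)) keep.add(col);
    }
  }
  // Rebuild each record with only the columns that hold a value somewhere.
  return records.map((rec) =>
    Object.fromEntries(Object.entries(rec).filter(([col]) => keep.has(col)))
  );
}
```

A column is kept for all rows as soon as one record has a value in it, so the download stays rectangular.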

JimBacon commented 7 years ago
  1. Could use families to determine use of attributes but it seems a shame not to use the scheme data if it is going to be present and it might lead to a simpler implementation.

  2. Agree with super-survey. The survey per scheme breaks down if you want to record species from different schemes in a single sample. Sample attributes will be fixed. Only occurrence attributes will be dynamic. (Dynamic sample attributes suggests sub-samples and probably an over-complicated user interface to me.)

  3. We could implement dynamic termlists in two ways:
     a. A fixed attribute with a termlist containing all possible terms, which are shown dynamically according to species.
     b. A number of attributes which are shown dynamically, each with a fixed termlist.

Option b can extend to support the extra attributes row while a does not. A solution could implement both a and b but it might be simpler to select one only.

Question. To what extent is this a general requirement or is it an iRecord only requirement?

DavidRoy commented 7 years ago

Re: To what extent is this a general requirement or is it an iRecord only requirement?

I think it's a general requirement (e.g. could be useful to LERCs) but my only current requirement is for iRecord so that should be the focus.

JimBacon commented 7 years ago

Thanks. That would confirm a solution in the warehouse is preferable to one confined to iRecord.

DavidRoy commented 6 years ago

We are at the point where we need to decide on (and implement) an approach for this.

Karolis has implemented dynamic attributes within the iRecord App. See demo with dynamic attributes for dragonflies and bryophytes (using taxon_group for dynamic element): http://irecord-app.herokuapp.com/#samples

The occurrences will be submitted under a super-survey which includes many attributes. The intention is that users will be able to edit records via the generic editing form on iRecord, e.g. https://www.brc.ac.uk/irecord/edit-generic-record?occurrence_id=6409756

This form will become unwieldy as we add in lots of dynamic attributes.

@johnvanbreda is it possible for the generic editing form to have a dynamic element, displaying attributes based on the taxon_group of the species? If so, how much work is involved and when could you fit this into your schedule?

johnvanbreda commented 6 years ago

@DavidRoy I'm now about to tackle this as I have a similar requirement for another project (so can share the costs). The requirements are slightly different in how the attributes are going to be set up, but it will make a more powerful and flexible solution.

@kazlauskis can you let me know if anything I am planning below contradicts the way you've done this in your app demo please?

The summary of the proposed approach is that we will define taxon attributes that are linked to taxa in the species list data (presumably against UKSI for UK data) by inputting values for the attributes against the appropriate taxa. The attributes will act as templates from which we can automatically derive occurrence attributes and sample attributes where relevant. The values recorded against a taxon for an attribute will do 2 things - firstly, declare that this attribute is available for this taxon (and all its descendants). Secondly, the value (or range of values or multiple selection of terms for lookup attributes) define the validation rules that can be used when values are input for occurrences of this taxon. We will enhance the attribute values to allow ranges so the taxon attributes can be used to provide a range of possible values for the input occurrence data. We’ll also have to build a better hierarchical index of taxonomy in the cache tables so that it’s easy to grab all attribute data looking up or down the hierarchy.

A worked example might make this clearer:

  1. Create a taxon attribute called wing length (mm) - float.
  2. Tick a box to set a new flag “applies to occurrences”.
  3. Add a value for this attribute to the Insecta taxon. Either set a new special value “any”, or a range of possible wing lengths for all insects (e.g. 0.5 to 200mm).
  4. Add another value for this attribute to the Odonata (dragonflies) taxon, with a range set to 20mm - 120mm.
  5. Because the “applies to occurrences” box was ticked, when the attribute was saved an equivalent occurrence attribute will be created automatically by the system.
  6. Link the occurrence attribute to the “mega-survey”.

Now, when the user selects an insect species, a query will use the taxonomic hierarchy index to look for any taxon attributes attached to the selected taxon or any of its ancestors that also have an occurrence attribute linked to our survey. If any attribute is found multiple times, we’ll use the one attached at the lowest level. Therefore a non-Odonata insect will find the associated wing length attribute via the taxon attribute value 0.5-200. The input form can then auto-create the control and use the range 0.5-200 as a validation rule. If they picked an Odonata species then the validation rule will be derived from the taxon attribute value linked to Odonata, so the validated range is 20 to 120.
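
The "lowest level wins" rule in the wing length example can be sketched as below. This is an illustration only: the simplified ancestor paths stand in for the hierarchical index, and `resolveRange` and `isValidWingLength` are hypothetical helpers, not warehouse functions.

```javascript
// Simplified ancestor paths, root first (hypothetical stand-in for the
// hierarchical taxonomy index; intermediate ranks are omitted).
const taxonPaths = {
  'Aeshna caerulea': ['Insecta', 'Odonata', 'Aeshnidae', 'Aeshna caerulea'],
  'Bombus terrestris': ['Insecta', 'Hymenoptera', 'Apidae', 'Bombus terrestris'],
};

// Taxon attribute values for "wing length (mm)" from the worked example.
const wingLengthValues = {
  Insecta: { min: 0.5, max: 200 },
  Odonata: { min: 20, max: 120 },
};

// Walk from the selected taxon up towards the root; the first value found
// is the one attached at the lowest level of the hierarchy.
function resolveRange(taxon, values, paths) {
  const path = paths[taxon] || [];
  for (let i = path.length - 1; i >= 0; i--) {
    const value = values[path[i]];
    if (value) return value;
  }
  return null; // the attribute is not available for this taxon
}

// Validate an entered wing length against the resolved range.
function isValidWingLength(taxon, mm) {
  const range = resolveRange(taxon, wingLengthValues, taxonPaths);
  return range !== null && mm >= range.min && mm <= range.max;
}
```

With these values, a 15mm wing length passes for the bumblebee (Insecta range) but fails for the dragonfly, because the Odonata range overrides the Insecta one.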

A taxon attribute could also point to a lookup list (e.g. life stages) meaning that the occurrence attribute values associated with this would be picked from the same term list and could also have the range of possibilities enforced by the range of options chosen for the taxon. So you could have a single term list with all life stage terms in it, then link different terms to different nodes in the hierarchy to define their availability. Note that because the taxon attribute value can be set to “any”, you can allow any term from the lookup list to be picked in the occurrence data where appropriate. This might be useful for longer lists of options, e.g. habitats, sampling methods.

Worked example for life stages:

  1. Create a term list for all known insect life stages and an associated taxon attribute (multi-value).
  2. Tick the box to say “applies to occurrences”
  3. Add a value for this attribute to the Insecta taxon, with the special value “Any”.
  4. Because the “applies to occurrences” box was ticked, when the attribute was saved an equivalent occurrence attribute will be created automatically by the system.
  5. Link the occurrence attribute to the “mega-survey”.
  6. At this point, for any record of any insect, the Insect Life Stage attribute is available.
  7. Add a 2nd value for this attribute to the Odonata taxon, and add multiple values, one per allowed life stage for dragonflies (e.g. Larvae, Adult etc).
  8. Now if a dragonfly species is input, the list of available options will be limited because the choice from lower down the taxonomic hierarchy (dragonfly life stage = egg/larvae etc) overrides the choice of “any” at the insect level.
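
A minimal sketch of how the "Any" value and the lower-level override in the life stage example could interact. The term list contents and the `allowedStages` helper are assumptions for illustration; the real term list and API may differ.

```javascript
// Hypothetical term list of insect life stages.
const allLifeStages = ['Egg', 'Larva', 'Pupa', 'Adult'];

// Taxon attribute values: "any" at Insecta, a restricted set at Odonata.
const lifeStageValues = {
  Insecta: 'any',
  Odonata: ['Egg', 'Larva', 'Adult'],
};

// path lists the ancestors root first, ending at the selected taxon.
function allowedStages(path) {
  for (let i = path.length - 1; i >= 0; i--) {
    const value = lifeStageValues[path[i]];
    if (value === 'any') return allLifeStages; // no restriction at this level
    if (Array.isArray(value)) return value; // restricted lower down the hierarchy
  }
  return []; // attribute not offered for this taxon
}
```

For a beetle the whole list is offered via Insecta, while a dragonfly gets only the terms linked to Odonata.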

Also note that you can set a flag “applies to samples” on the taxon attribute when defining things like the habitat that would be input at the sample level.

The idea above of linking this to recording schemes isn’t really necessary now - as a separate task we could link taxa to recording schemes which would give a list of attributes for each scheme, but it’s a separate requirement I think.

There will need to be some UI updates to make configuring all this easier; fortunately this is specified as part of the other project.

JimBacon commented 6 years ago

That sounds pretty reasonable to me.

Could you confirm that

Would another website be able to use the UK Master List and augment it with their own dynamic attributes?

A dynamic attribute which applies to sample would presumably create sub-samples in the record.

johnvanbreda commented 6 years ago

@JimBacon I had envisaged that the attributes would generally always be attached to UKSI (since it has a fairly reliable hierarchy) and that any list which has a TVK in the external key would then be able to match across to find the attributes. Even though the UKSI list might end up with lots of attributes attached from lots of websites, they would only be relevant to the survey datasets where the linked occurrence or sample attributes had been joined to the survey dataset (see point 5 in both examples above). It would be possible to keep attributes in separate lists I think, though the cost of this increase in flexibility might be an increase in the complexity of some of the joins required to find the attributes. I think I've just confirmed your 2nd bullet point and the question about websites augmenting the UKSI attributes with their own.

The last question about sub-samples would not be necessary for single sample/record forms. For multi-record forms I think that in many cases you could attach the attributes to the same sample. For example, let's say you were recording lichens with specific pollution tolerance and other requirements relating to the chemical conditions. One lichen might ask you to measure the pH, another might ask you to measure the sulphur content of the substrate. If they are in the same sample, then both these measurements can be simply attached to one sample. I suppose there may be other implementations where you want to create sub-samples to keep things separate, but I think that would be up to the implementation.

JimBacon commented 6 years ago

Glad I asked about how other websites create dynamic attributes. That was not the design I had imagined. It makes the UK Master List a special case of a taxon list. Will that work okay for other warehouses you know?

Sorry about the sub-sample question. I realise (after a lot of muddled thinking) that this just comes down to your experimental design of what a sample is and how fine-grained your measurements are. If you choose a design without sub-samples and you add two different species needing the same dynamic sample attribute then you only need to add it to the form once. Simple.
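
The "add it to the form once" point could be sketched like this, assuming a design without sub-samples. The taxon names, attribute captions and the `sampleAttributesFor` helper are placeholders invented for illustration.

```javascript
// Hypothetical mapping from a taxon to the dynamic sample attributes it asks for.
const sampleAttrsByTaxon = {
  'Lichen A': ['Substrate pH'],
  'Lichen B': ['Substrate sulphur content'],
  'Lichen C': ['Substrate pH'],
};

// Collect the sample attributes needed by all taxa in one sample,
// adding each attribute to the form only once.
function sampleAttributesFor(taxa) {
  const seen = new Set();
  for (const taxon of taxa) {
    for (const attr of sampleAttrsByTaxon[taxon] || []) {
      seen.add(attr);
    }
  }
  return [...seen]; // a Set keeps first-insertion order
}
```

Two species asking for the same measurement therefore share one control on the sample.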

kazlauskis commented 6 years ago

Just to double check I understood it correctly. A taxon attribute would be defined in such (simplified here) way:

name: "wing length",
type: "float",
smp: [ ]
occ: [X]
taxa: {
  "Insecta": { value: "any"},
  "Odonata": { value: " 0.5-200"},
}

This would be linked to the occurrence attribute that is then associated to the 'mega-survey' shared between the iRecord website and the app.

 // what is the role of the occ_attr if we have the taxon_attr?
 taxon_attr <- occ_attr <- survey // ?

A scenario: A user has incorrectly recorded some insect species and to correct it now selects Aeshna caerulea in the dynamic form. The form would redraw itself to show the 'wing length (0.5-200)' attribute that the species parent (Odonata) has an association with.

"Animalia" : {
   "Euarthropoda": {
      "Insecta": {
           "Odonata": {
               "Aeshnidae": {
                   "Aeshna caerulea"
                }
            }
       }
   }
}

If this is how it is, then I like it, though it is different and poses some challenges for integrating it with the mobile apps. At the moment, the dynamic attributes in the iRecord App are flat - there is no hierarchy, as the definitions of the attributes are directly associated with informal taxon groups and nothing else. If we moved to your proposed idea then, from the user's perspective, all the submitted records that belong to the current app-survey would still be valid (?) and the user shouldn't notice much difference. The app, on the other hand, would need to move to using a hierarchical UKSI list within the mobile device, which is possible but requires a bit more thinking and rewriting some highly optimised data structures. I will look into this and get back to you soon. Otherwise, as far as I understand, it sounds powerful and a good idea.

DavidRoy commented 6 years ago

If I understand it correctly, John's proposal has the advantage of being flexible to enable dynamic attributes to be set at any level of the taxon hierarchy. What Karolis has implemented in the App sets the dynamic attributes at the taxon_group level on the expectation that most attributes are defined by National Recording Schemes.

My question is therefore whether the extra complexity is needed. I assume John's other work requires this.

johnvanbreda commented 6 years ago

@kazlauskis & @DavidRoy, I think your question is basically the same - is there actually a need to attach dynamic attributes at any level in the hierarchy, or would limiting the attachment at group level be sufficient? The solution currently implemented was funded by the DGfM (German Mycologists) and they do have the requirement to link attributes in a much more fine-grained way than at taxon group level. Typically attributes will be associated at the genus level and simply associating all attributes to "fungi" would defeat the purpose of the development. In fact they are also able to limit the associations to certain life stages, e.g. measurements of cap width for fruiting fungi only. This extra stage linking functionality can simply be ignored if not required. I suspect that a solution limited to linking attributes to taxon groups would meet the requirements in the UK only so far and that sooner or later we will find cases where additional control is required. Just as an example, recording "insect - hymenopteran" might need a prey attribute for wasps but attributes relating to flower visit/pollen collection for bees. Karolis - V2 (the develop branch which this feature is in) has a cache_taxon_paths table with the hierarchical information required to make the querying for all attributes linked to a taxon or any of its parents efficient.
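
To illustrate the fine-grained linking (including the optional life stage restriction), here is a small sketch. It does not reflect the actual cache_taxon_paths schema; the link rows, stage names and the `attributesFor` helper are hypothetical.

```javascript
// Hypothetical links between taxa and attributes. An optional `stages`
// list restricts a link to certain life stages.
const attributeLinks = [
  { taxon: 'Hymenoptera', attribute: 'Abundance' },
  { taxon: 'Vespidae', attribute: 'Prey' },
  { taxon: 'Apidae', attribute: 'Flower visited' },
  { taxon: 'Agaricus', attribute: 'Cap width (mm)', stages: ['Fruiting'] },
];

// path lists the selected taxon and all of its ancestors.
function attributesFor(path, stage) {
  return attributeLinks
    .filter((link) => path.includes(link.taxon))
    .filter((link) => !link.stages || link.stages.includes(stage))
    .map((link) => link.attribute);
}
```

A bee and a wasp share the order-level attribute but each picks up its own family-level one, and projects that do not need stage linking just leave `stages` unset.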

kazlauskis commented 6 years ago

I am happy for this to be done in either way. I will very soon be blocked by this and so it would be good to have the edit form on the website asap. The app is now going to support even more taxon specific attributes and if we don't have a form ready for users to edit their records, the app's record editing on the website will be limited at best. Otherwise, it will be very confusing because the forms will have to be bloated to accept any possible argument, which our super-survey is holding at the moment.

If I understand this correctly, we can assign the same set of attributes to multiple taxa that would essentially constitute an informal taxon group? If so, then I am all good. Later syncing such survey config with the app might be a challenge, but at least it will get me going.

Thanks!

kazlauskis commented 6 years ago

I am happy to create a highly dynamic edit form myself - based on the front-end javascript rather than the php iform module. This isn't following how other forms are done right now, but might be a temporary solution to get us going.

johnvanbreda commented 6 years ago

Hi Karolis - I plan to have the dev server up and running with the latest code this morning - will that help? You can then try the new dynamic forms stuff out.

kazlauskis commented 6 years ago

Thanks John, yes, this would be great. I would be able to start working on it this weekend, so any time this week is good.

johnvanbreda commented 6 years ago

The dev warehouse is now running the v2 code.

kazlauskis commented 6 years ago

Great, thanks for implementing this @johnvanbreda. This is definitely making some progress now. I have now tested the dynamic attributes for the iRecord App survey, which at the moment uses one big bloated survey that shows all the possible attributes on the website. In general, I like it very much; it is easy to link the attributes and show them on the edit form. The app survey is made up of a bunch of core attributes that all the species share, plus some that are taxon specific. All of the attributes are in one survey, which allows us to load the different attributes depending on the taxon selected. There are some small issues and thoughts that occurred to me while setting this up though.

This is the dynamic form for the app: link This is the current iRecord App survey: link

  1. At the moment, the in-app and the warehouse surveys require a manual sync. This I am doing manually and as everything manual can be error prone, it is worth having a think of how this could be more automated in the future. This is mainly due to supporting many taxon specific attributes, which might render the survey quite complex later. Exporting the survey might be a very good way to sync the surveys, but at the moment taxon specific attribute metadata is not visible, so this is something worth implementing at some point I think.

  2. Is there a way to disable an attribute down the taxon chain? For example, Animalia has a general abundance attribute, but butterflies have another attribute (different ID), so the web edit form shouldn't show the general (Animalia) attribute in this case. Is there a way to do this at the moment?

  3. When adding occAttr:105 attribute to the survey, it throws errors when editing other taxon specific attributes:

[image: screen shot 2018-10-07 at 12 31 27]

  4. When editing a record on the website, changing species got me the error:

[image: screen shot 2018-10-07 at 12 18 21]

  5. Adding an abundance attribute (occAttr:16) to Animalia didn't load it for butterflies; I had to attach it to the Lepidoptera taxon specifically.

  6. Unselecting a dynamic value in a dropdown select or removing a text-type-input value doesn't update it when reloading the form. For example, changing a non-dynamic attribute, like altitude, works fine; changing the Butterfly app abundance code doesn't.

  7. hookDynamicAttrsAfterLoad is undefined in:

    $.each(indiciaFns.hookDynamicAttrsAfterLoad, function callHook() {
      this($('.species-dynamic-attrs.attr-type-occurrence'), 'occurrence');
    });
    if (typeof jQuery.validator !== "undefined") {
      jQuery.validator.addMethod('customDate',
        function(value, element) {
          // parseDate throws an exception if the value is invalid
          try {
            jQuery.datepicker.parseDate('dd/mm/yy', value);
            return true;
          } catch (e) {
            return false;
          }
        }, 'Please enter a valid date'
      );
    }

Thanks again, looking forward to have this in production.

johnvanbreda commented 6 years ago
kazlauskis commented 6 years ago

1 & 2. thanks that sounds great.

  6. You can see the error here
    • Open a record to edit, like this one here. Masquerade as me if you can't access it.
    • At the moment attribute Ad is set to 1, try changing it to blank and saving the form
    • Open to edit the saved record (same link in step one). It should be now blank but is still set to 1
  7. To get the map to show up, I have added a fixing script line to the top of the form setup (i.e. the User Interface tab). It looks like this: <script> window.indiciaFns.hookDynamicAttrsAfterLoad = [] </script>. Remove it to see the error in the console. I have tested it with the latest Chrome and Safari.
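
A more defensive way to call the hooks, so that a missing array is treated as "no hooks" instead of raising an error, might look like the sketch below. This is not the fix that was applied in Indicia; the plain `indiciaFns` object stands in for `window.indiciaFns` and `callDynamicAttrHooks` is a hypothetical helper.

```javascript
// Stand-in for window.indiciaFns on a page where the hook array
// was never defined.
const indiciaFns = {};

// Call every registered hook, treating a missing or non-array value
// as an empty list. Returns the number of hooks called.
function callDynamicAttrHooks(fns, container, type) {
  const hooks = Array.isArray(fns.hookDynamicAttrsAfterLoad)
    ? fns.hookDynamicAttrsAfterLoad
    : [];
  hooks.forEach((hook) => hook(container, type));
  return hooks.length;
}
```

With a guard like this the inline `<script>` workaround that pre-defines the array would no longer be needed.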
johnvanbreda commented 6 years ago

Thanks @kazlauskis, 3, 6 & 7 are all fixed.

kazlauskis commented 6 years ago

Thanks, I will try this out this weekend. I think I am happy with the current implementation and I am looking forward to this being deployed to live iRecord. I will keep this ticket open till then. Any thoughts on when the live warehouse will be v2?

johnvanbreda commented 6 years ago

@BirenRathod The v2 warehouse code may as well be deployed the same day as the server infrastructure change I think? There will be a couple of hours of scripts to apply at the same time for the upgrade. When would you like to plan this in? I am away the week starting 22nd Oct by the way.

BirenRathod commented 6 years ago

@johnvanbreda, we can do it after you come back. @DavidRoy, any date in mind? I already have the new server ready, installed with Postgres 10. My only concern at present is that we need one whole weekend to copy all the images, and then the following day we could take the server down for the whole day and start taking the database backup and the rest of the images.

johnvanbreda commented 5 years ago

This functionality is available for single record forms. Do we need to implement for list of record forms as well?

DavidRoy commented 5 years ago

Is the single record form version live?

Ideally, we would have this for the list of records form - certainly for stage/sex/quantity terms, but may become unwieldy to add other attributes?

johnvanbreda commented 5 years ago

The National Trust "enter a casual record" form uses dynamic attributes for moth and bat species (https://www.brc.ac.uk/irecord/national-trust/enter-casual-record) but I think that's the only live instance on iRecord. This functionality is used quite a lot on the German Fungi recording site though.

For list of record forms, there are 2 possible implementations. Adding columns to the grid depending on selected species does indeed seem unwieldy even for just a few columns. But maybe having a default sex & stage column where the control gets swapped to a different control depending on the selected species should work. Any additional attributes (i.e. attributes where the system function of the attribute does not match one of the existing columns) would then only be available in the 2nd row which appears when you press the + button. Do you think we need both options?
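
The split between swapped column controls and the 2nd row could be sketched as follows. The attribute objects, the `system_function` values and the `placeAttributes` helper are assumptions for illustration, not the species checklist's actual data model.

```javascript
// System functions that have a default column in the grid.
const gridColumns = ['sex', 'stage'];

// Decide where each dynamic attribute for the selected species goes:
// into a matching default column, or into the extra (2nd) row.
function placeAttributes(attrs) {
  const columns = {};
  const extraRow = [];
  for (const attr of attrs) {
    if (gridColumns.includes(attr.system_function)) {
      // Swap the default control for the species-specific one.
      columns[attr.system_function] = attr.caption;
    } else {
      extraRow.push(attr.caption);
    }
  }
  return { columns, extraRow };
}
```

Columns with no matching dynamic attribute would keep their default control.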

DavidRoy commented 5 years ago

I could not see the dynamic attributes working on the NT form, e.g. entering a moth name still has the plant stage attributes shown? [image: nt pic]

I agree on the approach for the list of records form. If easier, this can be done in stages with the sex/stage attributes done first?

johnvanbreda commented 5 years ago

The Sampling method control visible in the screenshot here is the only dynamic one - that's just the way NT specified the form as they didn't ask for the stage to change.

Still to do - species list dynamic swapping of system controlled attributes.

DavidRoy commented 5 years ago

Closing as the specific issue has been dealt with