CIRDLES / Squid

Squid3 is being developed by the Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) at the College of Charleston, Charleston, SC and Geoscience Australia as a re-implementation in Java of Ken Ludwig's Squid 2.5. - please contribute your expertise!
http://cirdles.org/projects/squid/
Apache License 2.0

Elementary Task handling #293

Closed sbodorkos closed 4 years ago

sbodorkos commented 5 years ago

At present, most of the options in the Task menu lack context. I appreciate that this is partly due to incomplete design/implementation, but there are ways to maintain the current functionality whilst making it significantly easier to use.

First (and maybe only?) item in the Task menu should be something like 'Task Manager', and its primary function should be to display a scrollable, alphabetised list of a user's Task Library (however that is defined), because being able to see this will give context to all the other current menu functions.

The current Task (if there is one) would be highlighted in the list, and a 'mini-viewer' adjacent to the scrollable list would provide a (non-editable) overview of the Task (basically, everything in the first 4 lines of the Manage Current Task screen), just so a user could browse their library and remind themselves of the differences between a bunch of Tasks with very similar names. (For context, my own SQUID 2.50 squiduser folder contains 73 active Tasks, and while that might not be typical, and/or be a consequence of poor 'housekeeping', it is still important to have a "Library browser" as the first stop in 'Task management'.)

In that screen, you could implement the functionality of what are currently menu-items with buttons, which draw context from the displayed list. For example, before "Loading" (I think this should be "Importing") a Task from a Squid 3 Task XML, you'd like to be able to see your existing list, so you'd know whether you had already imported it on some previous occasion. Likewise, I would describe the process of generating a Squid 3 Task from a SQUID 2.50 Task as "importing". So the buttons I conceive are:

  1. Import Task from Squid 3 Task XML = dialog-box initiating addition of a new Task to the Library and list.
  2. Import Task from Squid 2.X Task (Excel) = dialog-box initiating addition of a new Task to the Library and list.
  3. Delete Task = deletes the selected Task from the Library and list (greyed-out if no Task selected).
  4. Task Designer ("Task Editor" might be more accurate) = open the selected Task in Task Designer; if none selected, open an empty Task Designer window. Contents should be non-editable on opening, at first (see below).
  5. Export Task to Squid 3 Task XML (greyed-out if no Task selected) = dialog-box initiating the Save of a Task as XML to some location?

Then, inside the Task Editor (which should ultimately represent the merged product of "Task Designer" and "Manage Current Task"), you could have another set of buttons, each of which unlocks the contents of the screen in some way:

  1. Edit Current Task = allows modifications of the existing Task, as per "Manage Current Task".
  2. Create New Task from Current Task = creates a new Task called "Copy of _Current_TaskName", and allows modifications of it.
  3. Create New Task from Blank = could activate a series of otherwise-greyed-out "Templates for peaks and ratios" buttons, labelled "9-peak zircon GA", "10-peak zircon GA", "11-peak zircon GSC".

cwmagee commented 5 years ago

Is "Export Task" the same as "Save in Library"?

bowring commented 5 years ago

This is a great discussion. There are many underlying issues, however. For example, what exactly is a task in the task library? Say the task library is user-specific and the user selects a task from the library that matches their data file (at least on mass count) for use in a new project. The user then modifies the task in some way - say changes an expression. I argue that this is now a different task and if the user wants to save it to the library, they need to give it a new version number (we have not yet introduced a scheme for versioning tasks), and the suggested highlighting of the task in the list of tasks would disappear upon a modification in the task and then reappear highlighting the new version of the task if the user elected to save the new task to the library. Also, if the user removed the original task from the library, any other project using that original task would lose its context-highlighting ability unless we also provided that any Squid project loaded would automatically populate the user's library with the current task if it were missing and if it did not clash (name and version) with one already there. We do this currently for parameter models.
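The auto-populate rule described above (add a project's task to the library only if it is missing and does not clash on name and version) can be sketched as follows. This is a minimal illustration, not Squid3 code: `TaskEntry`, `TaskLibrarySketch`, and the integer `version` field are all hypothetical, since no versioning scheme for tasks has been introduced yet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical library entry: a task identified by its name plus a version number. */
final class TaskEntry {
    final String name;
    final int version;        // assumed integer versioning; no real scheme exists yet
    final String definition;  // stand-in for the real task content

    TaskEntry(String name, int version, String definition) {
        this.name = name;
        this.version = version;
        this.definition = definition;
    }

    /** Composite key, so that same-named tasks with different versions can coexist. */
    String key() {
        return keyOf(name, version);
    }

    static String keyOf(String name, int version) {
        return name + "#v" + version;
    }
}

/** Hypothetical user-specific task library. */
final class TaskLibrarySketch {
    private final Map<String, TaskEntry> entries = new LinkedHashMap<>();

    /**
     * Called when a project is loaded: adds the project's task only if no entry
     * with the same name and version is already present. Returns true if added.
     */
    boolean addIfMissing(TaskEntry task) {
        return entries.putIfAbsent(task.key(), task) == null;
    }

    boolean contains(String name, int version) {
        return entries.containsKey(TaskEntry.keyOf(name, version));
    }

    int size() {
        return entries.size();
    }
}
```

In this sketch a modified task saved under a new version number becomes a separate entry, while reloading a project whose task is already in the library leaves the existing entry untouched.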

Let's start by working on some definitions:

Project - a file that contains (assume for now Geochronology):

  1. its own data file (Prawn or OP, etc.) independent of the original source data file, including any edits or deletions the user makes to spot names, etc. within the project
  2. selections for reference material and sample filtering into subgroups for the data file, including delimiter
  3. selections for parameter models to be used in the data reduction
  4. instructions for preparing the data file: normalize counts for SBM and ratio calculation method, whether Squid will auto-reject when calculating means, choices for U and Th thresholds
  5. metadata about the project, as in notes, provenance, analyst, lab, etc.
  6. a task

Task - a specification for data reduction and reporting that lives within a project and that optionally can be exported / imported as a file, containing:

  1. a list of masses that the task will use in specifying ratios and in defining expressions, including the required 204, 206, 207, 208, and BKG if used
  2. a list of ratios made from the list of masses that the task will use in specifying ratios used in defining expressions, including the required 204/206, 207/206, 208/206
  3. a set of custom expressions that may be empty, using the masses, ratios, names of parameters (values supplied on demand from models), functions, operations, etc. specified by Squid. Note that some expressions may have been designed to target specific subgroups of unknown samples in a given data file and these subgroups result from project-level filtering, so a task does not independently know about subgroups.
  4. directives, including expressions as needed for the four Squid horsemen #-1, -2, -3, -4
  5. preferred index isotope
  6. eventually, specifications for various visualizations and outputs
  7. metadata about the task, including type (Geochron, General), author, lab, provenance, etc.
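As a rough illustration of the Task definition above, the sketch below models only the first three components (masses, ratios, custom expressions) and checks for the required masses and ratios named in the definition. The class name `TaskSpecSketch` and its method are hypothetical and are not taken from the Squid3 code base; BKG is left out of the required list because the definition says it applies only "if used".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of a Task as defined in this thread. */
final class TaskSpecSketch {
    // Required entries as listed in the Task definition.
    static final List<String> REQUIRED_MASSES = Arrays.asList("204", "206", "207", "208");
    static final List<String> REQUIRED_RATIOS = Arrays.asList("204/206", "207/206", "208/206");

    final List<String> masses;
    final List<String> ratios;
    final List<String> customExpressions; // may be empty

    TaskSpecSketch(List<String> masses, List<String> ratios, List<String> customExpressions) {
        // Defensive copies so later changes to the caller's lists do not leak in.
        this.masses = new ArrayList<>(masses);
        this.ratios = new ArrayList<>(ratios);
        this.customExpressions = new ArrayList<>(customExpressions);
    }

    /** Returns a description of every required mass or ratio that is absent. */
    List<String> missingRequirements() {
        List<String> missing = new ArrayList<>();
        for (String mass : REQUIRED_MASSES) {
            if (!masses.contains(mass)) {
                missing.add("mass " + mass);
            }
        }
        for (String ratio : REQUIRED_RATIOS) {
            if (!ratios.contains(ratio)) {
                missing.add("ratio " + ratio);
            }
        }
        return missing;
    }
}
```

A task whose `missingRequirements()` list is empty satisfies the minimum content described in items 1 and 2; directives, index isotope, and metadata would be further fields in a fuller model.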

Squid workflow:

  1. A Squid project is the venue for first marrying a data file and a task, choosing the project-level specifications (2, 3, 4 above) and then executing the task's instructions.
  2. After marriage, the following are true:
     a) the names of the masses in the task (204, etc.) are mapped one-to-one to the mass stations in the data file so that the expressions (expressed in terms of the task masses) can be evaluated using the appropriate data.
     b) the parameter models selected in the project have been used to populate the values of the parameter variables specified in all Squid3 tasks, such as lambda238, etc.
     c) the built-in expressions for the task have been updated to use the parameters and the directives of the task.
     d) the data file has been pre-processed arithmetically per the project's specifications and in preparation for the task.
  3. After marriage, the following are possible inconsistencies:
     a) the custom expressions are (un)healthy and/or (in)correct.
     b) custom expressions designed for specific subgroups of unknown samples will not be mapped to the subgroups of samples in the current data file - the user will have to do this.

I think that once we agree on some definitions for project and task, we can better address questions about what is a task in the library and what are the common possible workflows beginning with each of:

  1. starting Squid3 and importing a data file into a new project ... workflow1, 2, ...
  2. starting Squid3 and opening an existing project and modifying it by importing a new data file ... workflow1, 2, ...
  3. starting Squid3 and opening an existing project and modifying it by importing a new task ... workflow1, 2, ...
  4. etc.

This is why we say software engineering is harder than rocket science!

cwmagee commented 5 years ago

What are squid horsemen?

2a in workflow: Do masses in data and task need to be mapped 1:1, or does the data file merely need to have sufficient masses to populate the requirements of the task?

When starting a project, does it matter if you load data first and then the task, or the other way around? It shouldn't.

bowring commented 5 years ago

currently 1:1;

currently, data then task;

more general approaches can be proposed as future features once we have the basics done.

sbodorkos commented 5 years ago

These are interesting questions and issues.

Chuck, "4 Squid horsemen" = SQUID 2.50's "Special U-Th-Pb equations". The way we have implemented them in Squid3:

  -1 = 206Pb/238U normalisation expression
  -2 = 208Pb/232Th normalisation expression
  -3 = 232Th/238U expression

As per SQUID 2.50, a minimum of one and a maximum of two of these three expressions apply in any one Task (-3 is redundant if you specify both of -1 and -2, and is optional otherwise).

  -4 = Parent Element (usually U) concentration expression (independent of -1 to -3)
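The "minimum of one, maximum of two" rule for expressions -1 to -3 can be written as a small check. This is only a literal reading of the sentence above; the class and method names are hypothetical, and whether -3 on its own is a meaningful Task is not settled in this thread, so the sketch does not rule it out.

```java
/** Hypothetical check of the counting rule for the special U-Th-Pb expressions -1, -2 and -3. */
final class HorsemenCheckSketch {

    /**
     * Returns true when at least one and at most two of the three expressions are specified.
     * Expression -4 (parent element concentration) is independent and is not considered here.
     */
    static boolean isValidSelection(boolean hasMinus1, boolean hasMinus2, boolean hasMinus3) {
        int specified = (hasMinus1 ? 1 : 0) + (hasMinus2 ? 1 : 0) + (hasMinus3 ? 1 : 0);
        return specified >= 1 && specified <= 2;
    }

    /** Expression -3 is redundant once both -1 and -2 have been specified. */
    static boolean isMinus3Redundant(boolean hasMinus1, boolean hasMinus2) {
        return hasMinus1 && hasMinus2;
    }
}
```

For example, specifying -1 and -2 together is accepted and makes -3 redundant, while specifying all three, or none, is rejected.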

Task versioning: Having thought about this for a long while (and having written and then deleted a big pile of thoughts!), I can see the necessity, and I can also see practical problems.

Firstly, the Project definition is useful. Clearly in order to uniquely define a Project, you need fully specified values of all its components (including the Task), which means that the Task definition essentially folds into, and becomes part of, the Project definition.

This does imply the need to be able to uniquely identify the 'instance' of a Task that has been "snapshotted" during the save of a Project. Presumably there are existing mechanisms to do this sort of thing, which stamp the entity with a serial number, the time of the latest Save, etc. This gives rise to other issues relating to the unique identification of any given Task and its version (given that the Names of the Tasks can obviously easily be non-unique).

The practical issues relate to actual Task-handling in the Squid3 interface. As I said, I have 73 active Tasks in SQUID 2.50 where the versioning is human-controlled (i.e. I institute a newly-named Task, usually as a variant of an existing Task, ONLY when I make a "significant" change). The thing is, a lot of minor tinkering goes on in Tasks, often just to get them "right", but sometimes as a manual iteration measure, where you want to achieve something that is beyond the scope of the conventional "once-through" Task mechanism. (An example is calculating a fThU value for the reference monazite in Richard Stern-style monazite Tasks: you run the Task once to obtain a biweight value of ["264/254"] in z8153, and having obtained that value, you plug it in to "horseman" equation -3 and re-run the Task.) If every Edit and Save were preserved, I'd have over 1000 Tasks! And the rub is, 900+ of them would be of no interest to me, because they would predate some Edit that I obviously deemed necessary. Their main value is that some of them might represent "snapshots" that were needed at some past time as part of the rigorous definition of a saved Project.

Perhaps there is some simple way to 'concertina' the versions, in recognition of the fact that 99% of the time, the one people want to work with is the latest version. The worth of previous versions would be in their relevance to old Projects (or previous versions of a current Project).

I guess it follows from this that a Task can't be Deleted (unless it can be established that it has never been used in a Project). Instead it would need to be an 'Archive' function, to take them out of circulation and stop them cluttering the Task Manager, whilst still retaining whatever is needed to support Projects that might have used them.
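One way to picture the Delete-versus-Archive idea above is the sketch below: a Task that has never been used in a Project is removed outright, while a Task that has been used is moved out of the visible list but retained. All names here (`ArchivingLibrarySketch`, `markUsedInProject`, and so on) are hypothetical, and tracking Tasks by a plain string identifier is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical task library that archives, rather than deletes, tasks used by projects. */
final class ArchivingLibrarySketch {
    private final Set<String> active = new TreeSet<>();      // kept alphabetised for display
    private final Set<String> archived = new TreeSet<>();
    private final Set<String> usedInProjects = new HashSet<>();

    void add(String taskId) {
        active.add(taskId);
    }

    void markUsedInProject(String taskId) {
        usedInProjects.add(taskId);
    }

    /**
     * Takes a task out of circulation.
     * Returns true if it was deleted outright, false if it was archived instead.
     */
    boolean remove(String taskId) {
        if (!active.remove(taskId)) {
            throw new IllegalArgumentException("Not an active task: " + taskId);
        }
        if (usedInProjects.contains(taskId)) {
            archived.add(taskId); // retained to support projects that reference it
            return false;
        }
        return true;
    }

    /** Tasks shown in the Task Manager list, in alphabetical order. */
    List<String> visibleTasks() {
        return new ArrayList<>(active);
    }

    boolean isArchived(String taskId) {
        return archived.contains(taskId);
    }
}
```

Archived entries stay available for matching against old Projects without cluttering the list a user browses.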

In summary, I guess I am in favour of Tasks being 'tagged' in some way, to enable rigorous matching to versions used in Projects (which will, of course, have 'tagged' versions of their own). But I wouldn't like to have heaps of automatically-generated versions of a single Task of that name cluttering up my Library-list; I would like to retain control over when I intend to make a change to an existing Task that is sufficiently significant that I designate it a new Task (with a new name).

cwmagee commented 5 years ago

Of your 73 tasks, how many are "duplicates" with all the same math but different mass stations? The need to rewrite tasks whenever anyone pops an extra peak in is one of the most frustrating things about squid 2.X. I shouldn't need 4 tasks all with exactly the same equations in them to look at Keith's old ILC study, but I do because even though the task only uses 9 mass stations, the data sets have between 9 and 12, and old SQUID is too stupid to be able to ignore mass stations it doesn't need.

sbodorkos commented 5 years ago

Workflow: Possibly we need to explore the full range of uses of the software. The way I see it, we need to ensure we rigorously accommodate "non-Project" use, by which I mean "Editing" usage, where one might open Squid3 in order to curate Parameter Models, or compose/modify a new Task, without necessarily having a specific Prawn XML data-file in mind. I would argue this "editing" could also be extended to Prawn XML handling as currently defined by the Data menu. What all of these "edit"-style activities have in common is that none of them require a Prawn XML file to be married to a Task.

For mine, a Project is "born" at the moment you attempt to marry one (or more, in a Join... scenario) Prawn XML file(s) to a Task, and I agree with Jim (and SQUID 2.50) that that process should start with the Prawn XML file(s) in order to narrow down the range of potentially applicable Tasks. As Chuck has implied, this would also be where we look at reconciling mismatches between the desired Prawn XML and the desired Task, particularly where the peak-set of the latter is a true subset of the peak-set of the former.
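The reconciliation step mentioned above can be sketched as a comparison of two peak-sets: the current behaviour requires a one-to-one match, while the proposed relaxation would also accept a Task whose peaks are a subset of those in the data file. The class, enum, and peak labels below are hypothetical and do not reflect how Squid3 represents mass stations.

```java
import java.util.Set;

/** Hypothetical classification of how a Task's peak-set relates to a data file's peak-set. */
final class PeakMatchSketch {

    enum Match { ONE_TO_ONE, TASK_IS_SUBSET, MISMATCH }

    static Match classify(Set<String> dataFilePeaks, Set<String> taskPeaks) {
        if (dataFilePeaks.equals(taskPeaks)) {
            return Match.ONE_TO_ONE;      // what is required at present
        }
        if (dataFilePeaks.containsAll(taskPeaks)) {
            return Match.TASK_IS_SUBSET;  // extra peaks in the data file could be ignored
        }
        return Match.MISMATCH;            // the Task needs a peak the data file lacks
    }
}
```

Under this sketch, the case of a 9-peak Task applied to data sets with 9 to 12 mass stations would fall into `ONE_TO_ONE` or `TASK_IS_SUBSET`, so a single Task could serve all of them if the subset case were supported.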

At the moment, the "marrying" is done in Isotopes & Ratios... Manage Isotopes screen, and I think that item should be moved into the Task menu.

I think we also need to look at the Ratio-definition functionality in the Task Designer, compared to that in Isotopes & Ratios... Manage Ratios screen, and decide which to persist with, as we probably don't need both. Regardless of which is chosen, it is most sensibly invoked directly from the Task Designer (ratios lack useful context without a Task, I think), so the Isotopes & Ratios menu can probably be abolished.

Note also that the expressions associated with the Directives in the Task Designer are encroaching on what was previously the domain of (part of) the Expression Manager - that needs to be looked at, too.

cwmagee commented 5 years ago

A couple of things about tasks:

  1. Currently, most tasks are edited versions of previous tasks. As far as I know, all the U/Th/Pb tasks here at GA are edits of a task originally written by Simon. Kathryn and I may have written some isotope ratio tasks from scratch, but I don't know if those have ever been used in anger on data which was ever actually published. Sometimes these edits are substantial; sometimes they are trivial. The need to match mass stations in the data file and the task is a major factor in requiring trivial edits (e.g. inserting a peak in the task that is never used, simply because it exists in the data file).

  2. More people use tasks than edit them. Of the dozen or so people in GA who have used Squid 2 to reduce data, maybe a third of us have actually edited our tasks. Most people simply use a task edited by someone else. In terms of quality control, this is a good thing- ideally everyone at a given institution will use the same task when trying to accomplish the same thing.

  3. The clunkiness of Squid 2.5 task editor, and the limitations in types of data input, are barriers to innovation. Because it is hard to write tasks, it requires a lot of effort to build new methods of reducing data. Without a way to reduce data, there isn't much point in collecting it.

  4. Where should the line be between tasks and options? Of the 32 tasks I have in my current library, a number are basically the same task, except for the slope of the calibration line (e.g., slope 2 vs slope 1.8). Does it make sense for these to be different tasks, or is it better design to merge them into one task with different option settings (e.g. the calibration slope)?
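To make the "tasks versus options" question in item 4 concrete, here is a minimal sketch in which the calibration slope is an option on a single Task rather than a reason to keep several near-identical Tasks. The class `TaskOptionsSketch` and its fields are hypothetical; this is one possible design, not a description of Squid3.

```java
/** Hypothetical immutable Task settings in which the calibration slope is an option. */
final class TaskOptionsSketch {
    final String taskName;
    final double calibrationSlope;

    TaskOptionsSketch(String taskName, double calibrationSlope) {
        this.taskName = taskName;
        this.calibrationSlope = calibrationSlope;
    }

    /** Returns a copy of the same Task with a different slope; the original is unchanged. */
    TaskOptionsSketch withCalibrationSlope(double newSlope) {
        return new TaskOptionsSketch(taskName, newSlope);
    }
}
```

With this arrangement the "slope 2" and "slope 1.8" variants would share one library entry and differ only in a setting chosen at the time of use.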

cwmagee commented 5 years ago

On a more practical level: for the assigned Pb/U external 1sigma % err for uranium and thorium, the user really needs to be able to type in a number here. Mucking around with the arrows is supremely annoying. The reference material (RM) models need to be openable, and being able to make new ones would be handy. If this needs to be constrained (e.g. dropdowns consisting of published data properly attributed), we can help with that.

sbodorkos commented 5 years ago

You can make RM models yourself: it's not completely straightforward, but it is possible. For SQUID 2.50-style functionality, tick the "Apparent Dates" checkbox in your new model. Go into Resources... Parameter Models... Reference Material models and have a go at it.

The non-editable RM models are by design: they are intended as credible "set in stone" sources. If you want a variation, you can make a copy and edit that.

Note that once you have constructed the RM models you want, you can go into Task... Task Designer and designate your own models as the Default RM models - got that tip from @bowring last night and it has saved me a bunch of trouble!

NicoleRayner commented 5 years ago

When I make new reference models, I do it from the Data - Spot Naming - View RM Model window. I have never understood though why "create a new model" (aka edit an empty model) is under Edit, not under File - seems very counterintuitive (same in ETREDUX I think). What do folks (@sbodorkos, @cwmagee ) think of moving this to a command under "File"?


cwmagee commented 5 years ago

Sounds sensible to me. -C

bowring commented 5 years ago

Folks - It would be nice if separate issues existed for separate concerns - any volunteers to separate out these issues? Once Simon has finished the basic Squid math for unknowns and we have tested it, we will be ready for a UI-design brainstorming session.

NicoleRayner commented 5 years ago

I'll look after breaking this out into separate concerns - to help inform later UI design discussion.

bowring commented 4 years ago

All - It is time to revisit this conversation. @NicoleRayner - are you still willing to try to break out some issues? The new version (1.5.0) is due out in a few days and will hopefully inspire you with some of its changes.

bowring commented 4 years ago

This issue is addressed in v1.5.10 - any additional suggestions should be made via targeted issues - thank you