History-Research-Environment / HRE--History-Research-Environment

Main repo for HRE code
https://historyresearchenvironment.org/
GNU Affero General Public License v3.0

Code template and doc for Substitutions #25

Closed MichaelErichsen closed 6 years ago

MichaelErichsen commented 6 years ago

Robin:

There would be a need to implement an agreed set of APIs to the Substitutions area. They would be used in the skeleton, where a crude placeholder module would be included at first. That would be replaced by the full module once it had been reasonably tested in its sandpit. The Substitutions area would rely on another agreed set of APIs, to be built by Michael, to provide the storage of the template elements and the database retrieval of values that Laney's code wanted to use, but at the start that can be ghosted in a sandpit.
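
The placeholder arrangement described above can be sketched as an interface with a stub behind it. This is only an illustration: the names `SubstitutionEvaluator`, `PlaceholderEvaluator` and `SkeletonDemo`, and the method signature, are assumptions and not the agreed HRE API.

```java
import java.util.Map;

// Hypothetical API: the name and signature are assumptions, not the agreed HRE interface.
interface SubstitutionEvaluator {
    /** Evaluates a template against the supplied values and returns the resulting text. */
    String evaluate(String template, Map<String, String> values);
}

// Crude placeholder used while the real module is developed in its sandpit:
// it returns the template unchanged so that callers can be wired up early.
class PlaceholderEvaluator implements SubstitutionEvaluator {
    @Override
    public String evaluate(String template, Map<String, String> values) {
        return template;
    }
}

class SkeletonDemo {
    // The skeleton depends only on the interface, so the implementation can be swapped later.
    static String render(SubstitutionEvaluator evaluator, String template) {
        return evaluator.evaluate(template, Map.of());
    }

    public static void main(String[] args) {
        System.out.println(render(new PlaceholderEvaluator(), "[GIVEN] [SURNAME]"));
    }
}
```

Because the skeleton only sees the interface, replacing the stub with the tested module would not require changes to the calling code.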

THE PROPOSED TASK FOR LANEY (Overview - more detail later if agreed)

In TMG there are name style templates and sentence templates, in which a markup marker is replaced by a text string derived from user data. The same concept applies in HRE, with a more complete set of operators and a larger environment from which retrieved values may be sourced. In HRE, the use of this mechanism is extended to error and warning messages, filter definitions, sorting definitions, memos with embedded Citations, selection of columns for tabular output and, later, other types of report generation.
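
As a rough illustration of the substitution idea, the sketch below replaces markers in a template with values from a map. The marker syntax (an upper-case name in square brackets) and the class name `TemplateSketch` are assumptions made for this example; the real HRE operators are richer and are not defined in this thread.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {
    // Assumed marker syntax: an upper-case name in square brackets, e.g. [SURNAME].
    private static final Pattern MARKER = Pattern.compile("\\[([A-Z_]+)\\]");

    /** Replaces each known marker with its value; unknown markers are left in place. */
    static String substitute(String template, Map<String, String> values) {
        Matcher m = MARKER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' in user data being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> person = Map.of("GIVEN", "Ada", "SURNAME", "Lovelace");
        // prints: Lovelace, Ada ([BIRTH])
        System.out.println(substitute("[SURNAME], [GIVEN] ([BIRTH])", person));
    }
}
```

Leaving unknown markers untouched is one possible choice; a Business Rules validity check could equally reject the template instead.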

In HRE jargon this is the Substitutions area. The definition and evaluation of substitution templates becomes the core that gives the user the flexibility to configure HRE to match their workflow needs. The Substitutions area directly involves about 14 database tables and some auxiliary tables for each use case.

The task would involve the GUI to create/edit/store/evaluate the template and the code required to manage the associated database tables (via APIs defined with Michael). In the process, the Business Rules level would need to provide validity checks for definition editing and execution.

Much of the preloading of the lookup tables in the database, and the ability to store user data entered manually or imported from TMG, will rely on there being agreed structures and methods from the Substitutions area.

INTERACTIONS with the team

MICHAEL (1) to continue with the structure of the skeleton, including GUI and other BR modules as required

(2) with LANEY, jointly to define APIs for the evaluation of templates for each type of use of Substitutions

(3) with LANEY, jointly to define APIs for the storage and retrieval of data related to the definition and use of Substitutions.

LANEY (1) to build the GUI for defining/editing/managing/validating any use of Substitutions

(2) code to evaluate substitution templates and return the content to the requester

ROBIN (1) to provide additional documentation on the internals of Substitutions and their use cases.

(2) This documentation is currently being drafted to augment what has already been placed on GitHub

(3) To liaise with MICHAEL and LANEY to ensure the intention of the data model is achieved. It is quite likely that the field set of some tables will need minor adjustment.

INTEGRATION

When LANEY's code has been tested in isolation and seems robust enough, it can be integrated into the HRE skeleton.

There are other important tasks, (a) dates and time intervals (Java) and (b) field and record validation rules (Jython), that can start in the same way if others wish to become involved.

MichaelErichsen commented 6 years ago

The mock-up contains a functioning separation between data access, business logic, communication logic, and user interface. It already supports both client/server and single-machine operation, implementing the thoughts of Nils, and has been tested by him. It will of course need refinement along the way.

I will build a template for Laney and rewrite the documentation, so you can write your part to be plugged into the skeleton. If I am lucky it can be done in about a week's time.

RobinLamacraft commented 6 years ago

Hi Michael and Laney,

I am pleased that Michael thinks that my proposal is a good idea!

I am meeting Rod and Don in Melbourne at the end of next week (21 and 22 June) to review the priority of tasks here. Don and I are also meeting some anthropologist researchers working in a First Nations law firm. They use TMG, as do some similar researchers in other companies in that field.

One of the items on the agenda is to discuss what external data (not in the HRE H2 database) needs to be accessible to the substitution templates. Currently I have listed 3 files for the Client and 3 similar but different files for the Server. They are (labels for convenience of discussion):

CCE – the Client Common Environment file (XML): installation data and status

CUE – the Client Users Environment file (XML): user population properties and status

CPE – the Client Projects Environment file (XML): project external properties and status of use

SCE – the Server Common Environment file (XML): installation data and status

SUE – the Server Users Environment file (XML): user population properties and status

SPE – the Server Projects Environment file (XML): project external properties and status of use

We want to start identifying what data needs to be held in these 6 files, then the operations that would be needed to maintain these environment files, and hence an API definition for each.
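
One way to start on such an API definition is sketched below. Only the six labels come from the list above; the `EnvironmentFile` enum, the `EnvironmentStore` class, its methods and the sample key are hypothetical, and the sketch keeps values in memory rather than committing to any file format.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The six labels come from the discussion; everything else here is hypothetical.
enum EnvironmentFile { CCE, CUE, CPE, SCE, SUE, SPE }

class EnvironmentStore {
    private final Map<EnvironmentFile, Map<String, String>> data =
            new EnumMap<>(EnvironmentFile.class);

    /** Sets one value in the named environment file. */
    void put(EnvironmentFile file, String key, String value) {
        data.computeIfAbsent(file, f -> new HashMap<>()).put(key, value);
    }

    /** Looks up one value; empty when the file or key has no entry. */
    Optional<String> get(EnvironmentFile file, String key) {
        Map<String, String> values = data.get(file);
        return (values == null) ? Optional.empty() : Optional.ofNullable(values.get(key));
    }

    public static void main(String[] args) {
        EnvironmentStore store = new EnvironmentStore();
        store.put(EnvironmentFile.CPE, "project.name", "Test 1 Name"); // hypothetical key
        System.out.println(store.get(EnvironmentFile.CPE, "project.name").orElse("(none)"));
    }
}
```

Keeping the storage format behind a small set of operations like these would let the XML versus JSON question be decided later without changing callers.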

No doubt Michael will also know of other data that we have not considered.

Robin


-- Robin Lamacraft, Adelaide, Australia



MichaelErichsen commented 6 years ago

Just a general comment on data formats, which in and around HRE could be XML, JSON, H2, CSV, unstructured, or Java/Eclipse preference files.

We have earlier discussed preferring JSON to XML as the trend has clearly gone this way.

For properties/preferences in the HRE application, the internal representation is a properties/preferences file, which is maintained by the Eclipse platform. From the user side it is maintained in Preferences pages. So the operations and API are already provided by the platform:

http://www.myerichsen.net/HRE/Shot1.png http://www.myerichsen.net/HRE/Shot2.png http://www.myerichsen.net/HRE/Shot3.png

Definitions to be imported into HRE could be any format appropriate to other systems producing them.

That could be an installation configuration file, perhaps?

This is what the HRE skeleton produces right now:

CSMODE=DIRECT
DBNAME=C\:\Users\michael\Test1
H2TRACELEVEL=INFO
HELPSYSTEMPORT=8000
LOGLEVEL=INFO
SERVERADDRESS=127.0.0.1\:8000
SERVERPORT=8001
TLS=true
UPDATESITE=http\://www.myerichsen.net/HRE/update
USERID=sa
eclipse.preferences.version=1
project.0.lastupdated=2000-01-01 01\:01\:01
project.0.localserver=LOCAL
project.0.name=HRE
project.0.path=c\:/client/temp/HRE
project.0.summary=This is the default project
project.1.lastupdated=2018-06-12 12\:49\:09
project.1.localserver=LOCAL
project.1.name=Test 1 Name
project.1.path=Test1.h2.db
project.1.summary=This project is the first test project, which might be useful - or perhaps not
projectcount=2

The preferences are persisted by Eclipse in

\workspace\.metadata\.plugins\org.eclipse.core.runtime\.settings\org.historyresearchenvironment.client.prefs

and are not expected to be read or written by the user.

Michael

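
To show how the `project.<n>.<field>` keys and `projectcount` in the listing above fit together, here is a minimal sketch that reads them with plain `java.util.Properties`. It is for illustration only: inside the application the Eclipse preferences API would be the normal route, and the class and method names here are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class ProjectPrefsSketch {
    /** Returns the values of project.<n>.name for n from 0 up to projectcount - 1. */
    static List<String> projectNames(String prefsText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(prefsText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int count = Integer.parseInt(props.getProperty("projectcount", "0"));
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(props.getProperty("project." + i + ".name"));
        }
        return names;
    }

    public static void main(String[] args) {
        // A three-line excerpt of the preferences shown above.
        String sample = String.join("\n",
                "project.0.name=HRE",
                "project.1.name=Test 1 Name",
                "projectcount=2");
        System.out.println(projectNames(sample)); // prints [HRE, Test 1 Name]
    }
}
```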
RobinLamacraft commented 6 years ago

Hi Michael and Laney,

There are several issues that I am working through related to the Substitutions area:

(1) What style and sets of parameters are needed in the APIs to initiate the execution of a Substitution, so that each API is targeted to a specific Substitution outcome.

(2) Documenting the different outcome types of the Substitutions area with examples.

(3) What kind of user GUI and associated BR APIs should be available to create or modify a Substitution definition.

(4) How to store the suggested 6(?) external environment files? I agree that JSON looks like a more straightforward and compact representation.

[NOTE 1: The JSON parser has a value length limit of 8192 characters, which will be OK for some uses but not all. The 4MB maximum file size of JSON would be insufficient for some uses, but would be suitable for these external environment files. JSON's parser has 8 restricted keyboard characters, and also Unicode characters, that need backslash escaping. Managing those characters for environment data storage is an overhead, as that escaping is not required within the HRE database strings. (Should it be?)]

There are similar issues when using XML, so where the field length and total file size are within the JSON constraints, JSON would be preferable.

[NOTE 2: For bulk data importing and exporting, where the processing may exceed the JSON limits, XML would probably be the preferred format. Hence HRE will need a parser and an editor for both JSON and XML formats in the released product.]

(5) Documenting what data values should be managed and retrievable from the 6 external environment files.

(6) Identifying what (if anything) in the Eclipse application resource file may need to be accessed by substitution (or cached in a user-readable space).
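
On the escaping overhead raised in NOTE 1: the JSON specification (RFC 8259) requires the quotation mark, the backslash and the control characters U+0000 to U+001F to be escaped inside strings. A minimal sketch of that step is given below; the class name is hypothetical, and in practice a JSON library would normally do this on the caller's behalf.

```java
class JsonEscapeSketch {
    /** Escapes a string for use inside a JSON string literal (surrounding quotes not added). */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the six-character hexadecimal form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("He said \"hi\"\tand left"));
    }
}
```

All other characters, including non-ASCII ones, pass through unchanged in this sketch.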

Regards,

Robin


MichaelErichsen commented 6 years ago

Hi Robin

I am not aware of such a size limitation in JSON. Is there anything which I have failed to see? The JSON support in the mock-up is coded using the org.json Java classes, and I don't find any limitations there.

Br. Michael

RobinLamacraft commented 6 years ago

Hi Michael,

I did search the Internet and found some comparisons of JSON and XML that implied that there were some constraints on JSON parsing. I would be pleased if the maximum length of a JSON field value was somewhat greater than 8192 characters and the total JSON file size could be greater than 4MB.

All good. Maybe these sites were either out of date or applied only to some particular implementations. I am just being cautious: it is better to find such things out before one commits to them.

The extreme use of one of these (JSON or XML) formats in HRE would be to dump an entire very large HRE H2 database as a machine-readable form of application-neutral archive backup file.

Robin


MichaelErichsen commented 6 years ago

Hi Robin

For this purpose I would choose .csv (comma separated values). First, it is directly importable in spreadsheets and other databases, and secondly, H2 includes native functionality to import and export csv. I have already implemented that in the table csv import and export functionality in the DBADMIN and now in the skeleton.

Code samples from https://github.com/History-Research-Environment/HRE--History-Research-Environment/blob/develop/HRE--History-Research-Environment/org.historyresearchenvironment/bundles/org.historyresearchenvironment.client/src/org/historyresearchenvironment/databaseadmin/parts/H2TableNavigator.java:

final Csv csvFile = new Csv();
csvFile.setFieldSeparatorWrite(",");
csvFile.write(fileName, rs, "UTF-8");

and

final H2TableProvider provider = new H2TableProvider(tableName);
rowCount = provider.importCsv(fileName);

Br Michael

RobinLamacraft commented 6 years ago

Hi Michael,

If you are suggesting the use of CSV format for the 6 environment property and status files, then this may be a problem, because I was looking for a format that could easily allow for up to 3 levels of nested information to reduce duplication.

I know that this could be overcome by using more files, one for each cross-product between pairs of controlling margins. But that would lead to potentially more consistency checking as compared with the more normalized nested structure.

Robin


MichaelErichsen commented 6 years ago

Hi Robin

I only suggested csv for database and table import and export, using H2 features. For properties I would suggest java properties files. As you can see above I am using two levels for project definitions. Such a naming scheme would also fit three levels.

Br, Michael
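
A dotted naming scheme of this kind can be read back as a nested structure. The sketch below groups keys such as `a.b.c=value` into nested maps. The three-level key used in `main` (`server.0.project.1.name`) is invented for illustration, and the sketch assumes no key is both a value and a prefix of another key.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class NestedKeysSketch {
    /** Groups keys of the form a.b.c=value into nested maps: a -> b -> c -> value. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> toTree(Properties props) {
        Map<String, Object> root = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            String[] parts = key.split("\\.");
            Map<String, Object> node = root;
            // Walk (or create) one nested map per key segment except the last.
            for (int i = 0; i < parts.length - 1; i++) {
                node = (Map<String, Object>) node.computeIfAbsent(
                        parts[i], k -> new TreeMap<String, Object>());
            }
            node.put(parts[parts.length - 1], props.getProperty(key));
        }
        return root;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("server.0.project.1.name", "Test 1 Name"); // hypothetical key
        // prints: {server={0={project={1={name=Test 1 Name}}}}}
        System.out.println(toTree(props));
    }
}
```

The flat file stays readable as ordinary properties, while code that needs the nesting can rebuild it on load.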

MichaelErichsen commented 6 years ago

Have started pulling the parts for a sample together. Added a menu item to start the embedded HRE server. Added a sample H2 VIEW definition in the NewDatabaseProvider class.

MichaelErichsen commented 6 years ago

Class SampleView in package org.historyresearchenvironment.dataaccess generated from H2 VIEW using JPA.

MichaelErichsen commented 6 years ago

Finished and uploaded to Github.

MichaelErichsen commented 6 years ago

Will be uploaded as build 0.1.0.201807152107

MichaelErichsen commented 6 years ago

Closed with build 0.1.0.201807160730.