FreeUKGen / MyopicVicar

MyopicVicar (short-sighted clergyman!) is an open-source genealogy record database and search engine. It powers the FreeREG database of parish registers, the FreeCEN database of census records, the next version of the FreeBMD database of Civil Registration indexes and other genealogical applications.

Preparatory Discussion on FROTT (FreeReg Online Transcription Tool) #2472

Closed edickens closed 2 years ago

edickens commented 3 years ago

See #2329

Recent discussion in the Transcribers mailgroup has prompted me to think more about FROTT, which is becoming more urgent.

May I suggest that some Actions are added to Test3 (and Test2?) now in preparation for the FROTT programmer to test their work.

What I am proposing here should be easily understood by a transcriber as all that is happening is their files are being held on the server as "In progress" and not on their computer.

When working with FROTT, after logging in the transcriber needs to select "Batches".

Any file being worked on using FROTT will be held as "In progress". A way of holding these files separate from the main files is needed.

A button along the top of the "your files" screen is needed for FROTT. Or perhaps, as it is so important, a new line above the "Upload new file" for... "To Amend an existing file using FROTT: find the file in the list and click its FR option. To start a new file using FROTT click this Start new file" ...if selected, at present this should bring up "Not available yet", but it will enter FROTT's create new file option when ready.

Against each listed file there needs to be a new Action (say FR) which is the choice to "Amend an existing file using FROTT". It should now create a copy of the existing file in the set of "In progress" files, checking for duplicate filenames, then say "Not available yet".
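As a rough illustration of the FR Action described above, here is a minimal Python sketch of copying an existing file into an "In progress" set while refusing duplicate filenames. The function name, the exception, the directory layout and the case-insensitive comparison are all assumptions made for this example; none of them come from the MyopicVicar code base.

```python
import shutil
from pathlib import Path


class DuplicateFilenameError(Exception):
    """Raised when the "In progress" set already holds a file of that name."""


def copy_to_in_progress(source: Path, in_progress_dir: Path) -> Path:
    """Copy an existing file into the "In progress" set, refusing duplicates.

    Both paths are hypothetical; the real storage layout is not specified
    in this discussion.
    """
    in_progress_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: names differing only by case count as duplicates.
    existing = {p.name.lower() for p in in_progress_dir.iterdir()}
    if source.name.lower() in existing:
        raise DuplicateFilenameError(source.name)
    target = in_progress_dir / source.name
    shutil.copy2(source, target)  # copy; the main file is left untouched
    return target
```

Because the main file is only copied, never moved, the existing batch stays as it is until the transcriber later uses Upload or Replace.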

Also add a new line below the "FROTT Start new file" for files which are part complete... "To continue transcribing a file using FROTT click this Enter FROTT" ...this will list the "In progress" files; then, on finding the file, selecting the Action FR loads FROTT, but for now it should say "Not available yet".

FROTT should create new files in the "In progress" set of files.

Also, having selected Enter FROTT and listed the "In progress" files, there needs to be an Action (DE) to delete any file. These files do not need backups in the Attic, but the Action will need an "Are you sure" prompt. This Action can be operational now, as the Action to amend a file using FROTT will create an "In progress" copy.

Perhaps the list of "In progress" files should also have a Download (DL) option, should the transcriber want to work on the file offline with Excel and then use the existing replace screens.

When transcribing with FROTT is complete, then the file will need to be Uploaded or Replaced from within FROTT using the existing screens and deleted from the "In progress" set.
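The completion step could be sketched as follows in Python. This is only an illustration: `upload` is a hypothetical callable standing in for the existing Upload/Replace screens, and the rule that the "In progress" copy is removed only after a successful upload is an assumption, not something stated in the proposal.

```python
from pathlib import Path
from typing import Callable


def finish_in_progress_file(path: Path, upload: Callable[[Path], bool]) -> bool:
    """Upload a finished file, then drop it from the "In progress" set.

    `upload` is a hypothetical stand-in for the existing Upload/Replace
    screens; it returns True only if the file was accepted.
    """
    if not upload(path):
        # Upload failed or was cancelled: keep the file so no work is lost.
        return False
    path.unlink()  # remove from "In progress" only after a successful upload
    return True
```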

With this system, FROTT does not need to interface with the database, other than to find Place, Church and Register details. It will never need to make changes to the database. This should mean that FROTT can be written in the most suitable language for creating something like WinREG.

With these Actions the FROTT programmer should be able to test their work in Test3.

All existing file processing screens should remain for transcribers who wish to work offline or cannot work online due to a poor internet connection.

Thoughts? There may be other ways of doing this, but I think transcribers will easily understand this system. It looks very similar to what they do now. Eric

SteveBiggs commented 3 years ago

My initial thought would be to separate FROTT completely from existing batches on FreeREG and have a new button under 'Actions' called something like 'Online Transcription Tool'. Under this action button would be many of the actions Eric describes such as - 'Start New File', 'Continue Saved File', 'Download File' (for offline transcription), 'Upload File' (to continue transcribing an offline file using FROTT), etc. To "upload" from FROTT to the FreeReg database, there are two options; 1. a button under this FROTT area where you select the file and upload or replace, or 2. Under Batches when you select 'Upload New file' or 'Replace File', you can either browse your computer or browse the files in the FROTT in progress list and choose the one you want.

Needs discussion .....

Sherlock21 commented 3 years ago

I am inclined to go for Steve's version, i.e. a Transcriber logs on, then accesses their set of one or more jobs in progress via a button under My Actions, and there they pick the required transcription from an index of File Names AND file name in words, just to make it clearer. They would then do everything in that sub-app, including identifying the file to be uploaded or reloaded as appropriate, then move back to FR as it presently is to set off the Process Batch as now.

This seems a clearer partition of work, so the user knows where they are working. ALSO, to make it even clearer, have the screen look different to the FR current screen?

Sherlock21 commented 3 years ago

I should have added to my item above:

This FROTT method must be an alternative way of transcribing. Those Transcribers who wish to do so, must be able to operate exactly as at present.

edickens commented 3 years ago

Steve, I strongly disagree with you that FROTT should be separate from the other transcribing actions. We could have a new button "Transcribing" and there would be access to FROTT and uploading of files created offline. But that is a whole lot more programming and would move the existing link to uploading from where it is now. I went for the least change because people do not like change and so would stick with what they do now. Also, in the specification for FROTT, we have asked that on Close or Exit FROTT asks "Have you finished" and if Yes then it takes you direct to the upload screen with the filename ready. If you rely on them going back into the list of files and choosing which one to upload it will bring in errors and is more work. Eric

edickens commented 3 years ago

Eric B, Yes, the existing method of offline data entry must continue because there are those who have poor internet connections or have to pay by the amount of data transmitted. FROTT will create a lot of internet traffic.

I have put FROTT in with the existing screens used when transcribing. The least change possible. I do agree that it is not obvious at the top level which Action to use for transcribing and Batches should not be confused with Files. However FROTT, like WinREG, will create a Batch because the same Place, Church and Register Type will be used throughout the file.

If we do decide on a new top level button, it could be called "Transcribing". In this case I would like the link to uploading which is used when transcribing to be moved here from "Batches". The Batches button would be just that, to list the batches. But I still think transcribers will end up swapping between top level buttons because the "Transcribing" action would just be "In progress" files. What if they want to go back and check an existing Batch which would not be listed under this button? It is easy and simple to have it all under the existing "Batches", or a new name to indicate that here is the list of batches and links for transcribing. Eric D

Sherlock21 commented 3 years ago

I agree with your underlying need, viz. to replace WinREG with something that can be updated as required. So long as you cater for those who wish to work with Spreadsheets on their own PC, and upload as they do now, then I don't have anything else to add.

SteveBiggs commented 3 years ago

> Steve, I strongly disagree with you that FROTT should be separate from the other transcribing actions. We could have a new button "Transcribing" and there would be access to FROTT and uploading of files created offline. But that is a whole lot more programming and would move the existing link to uploading from where it is now. I went for the least change because people do not like change and so would stick with what they do now. Also, in the specification for FROTT, we have asked that on Close or Exit FROTT asks "Have you finished" and if Yes then it takes you direct to the upload screen with the filename ready. If you rely on them going back into the list of files and choosing which one to upload it will bring in errors and is more work. Eric

But there are no other transcribing actions under Batches unless you call "Upload new File" a transcribing action. All the rest are listing files and looking for errors of zero years.

I consider FROTT to be a completely new application on FreeREG for creating CSV files to be loaded into the FreeREG database. That's why I believe it should be under a new top-level action. I don't see why you need to move the existing Upload button at all: this is for uploading offline files. FROTT will need its own "Upload" button anyway because it's not looking for offline files. You can still have the "Have you finished" prompt and the rest you mention, so no more errors or work.

Sherlock21 commented 3 years ago

I totally agree with you, Steve. I can't help wondering: has there been a statistical audit done of which transcribers use which system, and how many will use FROTT for the whole job as opposed to how many will use other means for their transcribing before having something to upload?

SteveBiggs commented 3 years ago

Yes, in the interests of avoiding confusion, it would seem better to me to leave the Batches area and its Upload button untouched so offline transcribers see no difference at all.

If they make the positive decision to change to FROTT, they would accept it is a completely new feature under its own button.

edickens commented 3 years ago

Eric B. There has not been a survey, but from the transcribers mailgroup we know that a lot use WinREG, and Jo trains new transcribers on it. However, people are moving away from it because it doesn't handle flexible CSV, has problems with Windows 10 and might not work with Windows 11. And we cannot update it.

SteveBiggs commented 3 years ago

Yes, we need a WinREG-like program that handles flexible CSV which of course, is what FROTT will be. I think we all agree on that.

PatReynolds commented 3 years ago

FROTT (or rather, the FreeREG iteration of the UniTT (universal transcription tool)) can be used online or offline.

edickens commented 3 years ago

Hi Pat, How can it be used offline if each entry created is saved on the server and not the transcriber's computer? Eric

SteveBiggs commented 3 years ago

It has to be able to be used offline for those who have a poor internet connection but still want the advantages the tool will give over a standard spreadsheet. When used offline, the files will obviously have to be held on the transcriber's computer but the folder could be synchronised with a "mirror" file system online so that when you connect to the online tool, the files are synchronised between the computer and the tool. Then you could continue transcribing either online or offline and the files sync next time you connect. This would also have the advantage of an auto-backup of the files.

edickens commented 3 years ago

This is getting a bit beyond my level of programming expertise, but sounds like a really good idea. If I have understood it correctly, then the transcriber's "In progress" files need to be synchronised with the "In progress" files on the server. So my reason for preparing for FROTT is even more urgent. The "In progress" file storage system needs to be ready for the programmer. It involves changes to the website system which Vino will have to do.

Sherlock21 commented 3 years ago

If one source is synchronised with the other, you will have to predetermine which is the master. At the start, the Transcriber's file is the master (obviously), but after that would you want the FR file to take over? But the Transcriber may well have done some more transcribing offline, so you want to have that as the master then too. BUT if the Transcriber has lost or wiped their file, then you need to recover from the FR Master.

edickens commented 3 years ago

And FROTT needs to be independent of the operating system, so has to be opened from the server each time it is used. It cannot be saved on the transcriber's computer. WinREG is about 14.3MB with 5MB of this being the Help, so not too large. The important thing is not to program in too many graphics. The Help can be on the server.

PatReynolds commented 3 years ago

@Sherlock21 this functionality is already built in to FreeCEN2.

PatReynolds commented 3 years ago

For clarity: the FreeCEN2 system is still entirely an offline transcription system. Their equivalent of our Registers is Pieces. The pieces can be massive, so the system has been set up so that someone can upload a part piece, and it can go through the quality control checks and be made live, without waiting years for the whole piece to be finished. The quality-control-checked part-piece can be downloaded so that the errors which have been corrected do not have to be re-corrected. The format is CSV throughout.

Sherlock21 commented 3 years ago

It seems to me that it's pointless for dissenters to make any queries or comments as they just seem to get ignored - ref my post of 2 days ago.

SteveBiggs commented 3 years ago

> It seems to me that it's pointless for dissenters to make any queries or comments as they just seem to get ignored - ref my post of 2 days ago.

@Sherlock21 EricD replied to your question about a statistical audit and I then commented on his reply. You're not being ignored at all.

SteveBiggs commented 3 years ago

> If one source is synchronised with the other, you will have to predetermine which is the master. At the start, the Transcriber's file is the master (obviously), but after that would you want the FR file to take over? But the Transcriber may well have done some more transcribing offline, so you want to have that as the master then too. BUT if the Transcriber has lost or wiped their file, then you need to recover from the FR Master.

  • Not so easy to organise when you get down to the nitty gritty.

This type of online/offline file sync arrangement is quite common and the usual protocol would be that the version of any given file which has changed becomes the master. If there are conflicts where both the online and offline versions have changed, you should be given an option to choose which you want to move forward with.
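To make that protocol concrete, here is a minimal Python sketch of the decision for one file. It assumes the tool remembers the content as of the last successful sync (`base`), and it treats a missing copy on either side as something to restore rather than a deletion; both are assumptions for illustration, and all names are hypothetical rather than taken from any existing FreeREG or FreeCEN code.

```python
import hashlib
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    NOTHING = "both sides already match"
    PUSH_LOCAL = "copy the transcriber's version to the server"
    PULL_SERVER = "copy the server's version to the transcriber"
    ASK_USER = "both changed; ask the transcriber which to keep"


def _digest(content: Optional[bytes]) -> Optional[str]:
    """Hash file content; None means the file is missing on that side."""
    return None if content is None else hashlib.sha256(content).hexdigest()


def decide(base: Optional[bytes], local: Optional[bytes],
           server: Optional[bytes]) -> SyncAction:
    """Pick the master for one file, given its state at the last sync."""
    b, l, s = _digest(base), _digest(local), _digest(server)
    if l == s:
        return SyncAction.NOTHING
    if local is None:
        return SyncAction.PULL_SERVER  # lost or wiped locally: recover it
    if server is None:
        return SyncAction.PUSH_LOCAL   # new file, or missing on the server
    if l == b:
        return SyncAction.PULL_SERVER  # only the server copy changed
    if s == b:
        return SyncAction.PUSH_LOCAL   # only the local copy changed
    return SyncAction.ASK_USER         # both changed since the last sync
```

For example, `decide(b"v1", b"v2", b"v1")` selects `PUSH_LOCAL` because only the transcriber's copy has changed, while `decide(b"v1", None, b"v1")` selects `PULL_SERVER`, which covers the lost-or-wiped-file case raised in the quoted comment.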

PatReynolds commented 3 years ago

I am going to check the details of how the FreeCEN2 system (called CSVProc) handles the primary / secondary file situation in a way that both allows the transcriber to edit (e.g. in finally working out what a surname is, many pages on) and the first parts uploaded to be proof-read, have validation checks and be published.

Sherlock21 commented 3 years ago

Steve: I did not receive a copy of EricD’s reply that you refer to.

I pick up on points that I feel I can contribute to in the emails that I do receive. I don't have the time to go hunting.

Eric B

SteveBiggs commented 3 years ago

@Sherlock21 I also sometimes don't get emails when a reply is made to a GitHub issue, but you can easily check the latest state of the conversation by clicking the link at the bottom of one of the emails you did get.

No hunting needed :-)

PatReynolds commented 2 years ago

@PatReynolds to check that all the above has been captured in the Product Definition Document

PatReynolds commented 2 years ago

Checked: all universal features have been captured. Some specifications which seem to be FreeREG-specific, and which will need to be defined separately for all three services, have not yet been included (as far as I can see, but I may well be mistaken; @Irene-ene can you take a look?).

Captainkirkdawson commented 2 years ago

I continue to object to the name. We are talking about FreeReg Online Spreadsheet Transcription: FROST. It is essential that we clearly identify that we are not talking about a General Transcription application, of which there are hundreds, all aimed at freeform transcription. I also believe the team should pick the brains of @benwbrum

edickens commented 2 years ago

I like UniTT - Universal Transcription Tool. The only difference when using this for FreeREG will be the set of field names used and perhaps the output filename format.

edickens commented 2 years ago

To comment on Pat's post. When a transcriber logs in they have an Action called Batches. This is the screen they get: [screenshot: Capture.JPG]

Files just need to be flagged as "In Progress". The rest can remain the same. A transcriber needs to be logged into the relevant Project and that will give UniTT the correct set of fields to use when they select "Upload a new file". The fewer changes we make, the less resistance we will have from transcribers.

Captainkirkdawson commented 2 years ago

I am sorry @edickens but UniTT - Universal Transcription Tool is 100 times worse. Universal Transcription is not what we would ever develop. Universal Transcription in the mainstream is focused on the transcription of reports, meetings, etc., usually from audio. You see it used on television many times a day, e.g. a doctor looking at an image (or body) speaking into his 'phone'. Transcription takes that and produces a written report. Today it goes a great deal further, to language translation. We are Spreadsheet Transcription. We are UniSTT, which could so easily be misinterpreted or mispronounced!