ScottishCovidResponse / SCRCIssueTracking

Central issue tracking repository for all repos in the consortium
6 stars 0 forks source link

Extract the txt-to-XML converter from the BICI GUI #33

Open ianhinder opened 4 years ago

ianhinder commented 4 years ago

This is an essential piece of the pipeline for constructing the final input file for bici, but it's not in git. It currently exists only in the GUI version of bici (http://www.mkodb.roslin.ed.ac.uk/EAT/BICI.html), which is a 300 MB download. Chris says it is probably all within a single javascript file. This is probably BICI.app/Contents/Resources/app.nw/js/simulate.js, probably

function simsetup() // Sets up the simulation xml file

Extract this, make it work standalone, e.g. with something like node.js. It's awkward to pull in such a dependency, so eventually maybe we want to rewrite it in something else (e.g. python or C++) but for now, the fastest thing is to keep it in javascript.

ianhinder commented 4 years ago

Chris says:

Regarding the data converter: The file “import.js” takes the output from the “GenerateInputFIle” directory (e.g. Scotland_model .txt) and uses it to specify the model. Details of this are in the manual in Appendix A.

The function “startinference” in “Inference.js” then converts the model into xml format. It also generates data, but that part is not necessary because it is performed in “GenerateInputFIle” directory.

jholloc commented 4 years ago

Doesn't look like I can assign myself to this but i'm happy to take a look.

jholloc commented 4 years ago

Do you know where I might find the Scotland_model.txt? The examples in the BICI download don't seem to work (they have transition type='exponential' which throws an error).

jholloc commented 4 years ago

Nevermind, I found all the models in the BICI github.

jholloc commented 4 years ago

I'm getting the following error when trying to load the 'Scotland_model.txt' using the BICI GUI (and also when using the CLI version i'm creating):

On line 24: 'S08000015' is not a classification name

ianhinder commented 4 years ago

I've never tried to do it from the GUI. I'll escalate this to Chris; maybe he knows the reason. It's possible that the BICI GUI is out of date. Maybe he has a newer one.

ianhinder commented 4 years ago

Are you using the version available from http://www.mkodb.roslin.ed.ac.uk/EAT/BICI.html? Mac or Windows?

ianhinder commented 4 years ago

I get the same when I try it using the macOS version, which was downloaded as http://www.mkodb.roslin.ed.ac.uk/EAT/BICI_v1.0_Mac.zip.

image

jholloc commented 4 years ago

Are you using the version available from http://www.mkodb.roslin.ed.ac.uk/EAT/BICI.html? Mac or Windows?

I am using the Windows one, but I see you're getting the same error on the Mac version.

ianhinder commented 4 years ago

Chris says:

The version on the website is an old one. The current one is under development, but if you copy the attached files into the “js” directory of your downloaded version it should be able to do the conversion (I think).

  1. Click New
  2. Click import and select “Scotland_model.txt”
  3. Click on “inference -> Posterior” on the menu
  4. Click on the “Init” button. This should create an init.xml file. We want a programmatic way to go straight from “Scotland_model.txt” to “init.xml” js.zip
jholloc commented 4 years ago

Thanks. With the updated javascript I now have a working command line tool that can read a model text file and generate the XML file.

ianhinder commented 4 years ago

Excellent! Does the generated XML diff identically with the one in the repo? Can you stick it in the repo on a branch, along with some brief info about dependencies and how to run it? I'm going to do more thinking about the workflow for generating the input files in general, so I might play around with it a bit before it's finalised. Just having it runnable from the command line, and in git, is a massive improvement already; thanks very much!

jholloc commented 4 years ago

The only difference in the XML is: My init.xml: <inference nsamp="20000001" tmin="0" tmax="122" indmax="100000"/> Scotland_model.xml: <inference nsamp="20000001" tmin="0" tmax="122" indmax="undefined"/>

jholloc commented 4 years ago

I have checked in the code on a new branch (text2xml) and added some notes in the README to cover the usage of the tool. There are no dependencies apart from node.js so it shouldn't require and installation, the script should be able to be run as is.

ianhinder commented 4 years ago

Great! I'll take a look. Regarding the indmax difference above: there should be three versions of the XML file:

  1. The one in the repository
  2. One generated by the BICI GUI
  3. One generated by your command line version of 2

Which are the two which differ by indmax? I'm guessing the difference is between 1 and 3, and that 2 is the same as 3, meaning that the version in the repository wasn't generated by exactly the same version of the code that Chris sent. Alternatively, if 2 and 3 are different, this might be caused by some global variable state that hasn't been properly captured when extracting the required code, and presumably needs to be debugged.

ianhinder commented 4 years ago

Really good that we've got this running. Now we can do a bit of tidy-up. A few questions on the branch:

  1. Did you import all the js files, or just the ones needed for the conversion? (I'd say either is fine, I'm just curious)
  2. Can you add a command line argument to specify the output filename? It should make things a bit cleaner when we're scripting the workflow later.
  3. The "machine" argument, which goes into "ver", doesn't seem to be used. Can you confirm, and if so, remove it?
jholloc commented 4 years ago

Hi. I've fixed the difference in the XML - it was a missing call to initvar() to initialise some global state.

ianhinder commented 4 years ago

I've just tried it out. Node was easy to install on my Mac ("sudo port install nodejs14" with MacPorts, and it came in with no extra dependencies). However, while I get the success message, it doesn't seem to actually generate a file!

$ git describe --always --dirty --long
1479e37
$ ls *.xml
Scotland_bici_fixd_model.xml    Scotland_bici_input.xml     UK_bici_fixd_model.xml      UK_bici_model.xml
Scotland_bici_fixdg_model.xml   Scotland_bici_model.xml     UK_bici_fixdg_model.xml
$ node text2xml mac Scotland_model.txt 
init file created!
$ ls *.xml
Scotland_bici_fixd_model.xml    Scotland_bici_input.xml     UK_bici_fixd_model.xml      UK_bici_model.xml
Scotland_bici_fixdg_model.xml   Scotland_bici_model.xml     UK_bici_fixdg_model.xml
$ 

Can you check that it works for you, and you're not just looking at an old version of the file?

jholloc commented 4 years ago

Really good that we've got this running. Now we can do a bit of tidy-up. A few questions on the branch:

  1. Did you import all the js files, or just the ones needed for the conversion? (I'd say either is fine, I'm just curious)
  2. Can you add a command line argument to specify the output filename? It should make things a bit cleaner when we're scripting the workflow later.
  3. The "machine" argument, which goes into "ver", doesn't seem to be used. Can you confirm, and if so, remove it?
  1. I have only imported (and committed to the repo) the javascript actually needed.
  2. I can only specify the filename by modifying the BICI GUI javascript as it is hardcoded.
  3. The ver variable is used in inference.js (line 885) and that function won't run without it defined (though we can just set it to one of the values, i.e. "windows")
jholloc commented 4 years ago

Can you check that it works for you, and you're not just looking at an old version of the file?

For some reason when the machine is set to "mac" the file that is created is "/tmp/init.xml". (Also set in inference.js line 885).

ianhinder commented 4 years ago
  1. I can only specify the filename by modifying the BICI GUI javascript as it is hardcoded.

Since this is probably only a stopgap measure anyway, and we've only brought in the source files needed for the conversion, this copy isn't going to be used for the GUI in future, so I don't see a problem with modifying it. I guess it would need the variable to be passed through. If it's too painful, then don't bother, but if it's easier, it would be neater and potentially less error-prone (e.g. imagine for some reason we wanted to have more than one conversion running in the same directory at the same time; using a hard-coded output path makes that impossible).

  1. The ver variable is used in inference.js (line 885) and that function won't run without it defined (though we can just set it to one of the values, i.e. "windows")

Ah, I searched the diff on the github page, and that was suppressing large diffs. Do you know what the OS is used for? Line endings? I'd remove the "machine" feature and make it work the same across OSes, assuming that's not too complicated.

For some reason when the machine is set to "mac" the file that is created is "/tmp/init.xml". (Also set in inference.js line 885).

Maybe to work around sandboxing issues. Anyway, for us, I'd keep things consistent across OSes. I confirm that I have a /tmp/init.xml file generated.

jholloc commented 4 years ago

I've just pushed a commit that should fix both 2 and 3. The new command line usage is:

node text2xml model_file output_file
ianhinder commented 4 years ago

Hmm. Now I get:

$ git describe --always --dirty --long
6372ce3
$ git diff
$ node text2xml.js Scotland_model.txt outfile.xml
undefined:885
    switch(ver){
    ^

ReferenceError: ver is not defined
    at startinference (eval at <anonymous> (/Users/h52229ih/Documents/Projects/BICI/src/BICI/GenerateInputFile/text2xml.js:14:1), <anonymous>:885:2)
    at Object.<anonymous> (/Users/h52229ih/Documents/Projects/BICI/src/BICI/GenerateInputFile/text2xml.js:98:1)
    at Module._compile (internal/modules/cjs/loader.js:1185:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1205:10)
    at Module.load (internal/modules/cjs/loader.js:1034:32)
    at Function.Module._load (internal/modules/cjs/loader.js:923:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    at internal/main/run_main_module.js:17:47
$ 

Maybe you forgot to commit the changes to one of the files?

jholloc commented 4 years ago

Yes, I forgot to commit the updated inference.js. I've done this now.