metrumresearchgroup / bbi

Next generation modeling platform

Don't clean up .msf or .MSF by default #320

Open kylebaron opened 1 month ago

kylebaron commented 1 month ago

I think .msf currently gets cleaned up, but .MSF does not.

kyleam commented 1 month ago

With a quick search of the issue tracker and git logs, I wasn't able to find a rationale for why *.msf files are removed. If I understand correctly, you need the msf files as input for a downstream simulation.

It seems okay to me to change the behavior here, moving msf files to a higher clean level (related to gh-194). The fallout of that change would just be people noticing the files when they don't want them and then increasing their default clean level.


Here's the main spot that'd need to be adjusted to avoid having those cleaned at the end of the run: https://github.com/metrumresearchgroup/bbi/blob/835953efed10fff2d7024ac1c3dd88949b268c05/cmd/nonmem.go#L573-L586

There are also some other spots that would need to be adjusted. (We could use some cleanup in this area to avoid near-duplicate logic/values, some of which I don't think actually come into play for the current runner.)
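To make the idea concrete, here is a minimal sketch of what level-gated, case-insensitive msf handling could look like. This is not bbi's actual code: `isMSF`, `filesToClean`, and the `msfCleanLevel` threshold are hypothetical names and values chosen for illustration, and the file names in `main` are just examples.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// msfCleanLevel is a hypothetical threshold: msf files are only removed
// when the configured clean level is at or above this value.
const msfCleanLevel = 3

// isMSF reports whether name has an .msf extension, ignoring case,
// so that run1.msf and RUN1.MSF are treated the same way.
func isMSF(name string) bool {
	return strings.EqualFold(filepath.Ext(name), ".msf")
}

// filesToClean filters the cleanup candidates, keeping msf files out of
// the removal list unless cleanLevel reaches msfCleanLevel.
func filesToClean(candidates []string, cleanLevel int) []string {
	var out []string
	for _, f := range candidates {
		if isMSF(f) && cleanLevel < msfCleanLevel {
			continue // preserve msf output at lower clean levels
		}
		out = append(out, f)
	}
	return out
}

func main() {
	files := []string{"FDATA", "run1.msf", "run1_ETAS.MSF", "FCON"}
	fmt.Println(filesToClean(files, 1)) // [FDATA FCON]
	fmt.Println(filesToClean(files, 3)) // [FDATA run1.msf run1_ETAS.MSF FCON]
}
```

Matching on the extension with `strings.EqualFold` would treat `.msf` and `.MSF` identically, which addresses the inconsistency raised at the top of the thread regardless of which clean level ends up owning these files.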


Some scattered questions and notes on things I don't understand:

kylebaron commented 1 month ago

@kyleam Information on model specification files: https://ghe.metrumrg.com/pages/mrg/nm-help/nm750/html/modelspe.htm

As far as I know, if there is a .msf file in the completed run directory, it is because the user requested it in $EST (https://ghe.metrumrg.com/pages/mrg/nm-help/nm750/html/$estimat.htm). So I don't think it makes sense to automatically purge these files; if the user didn't want them, they shouldn't have requested them.

See https://ghe.metrumrg.com/pages/mrg/nm-help/nm750/html/$msfi.htm. The _ETAS.msf, _RMAT.msf, and _SMAT.msf files are all produced when the user requests msf output via the MSFO option in $EST and requests the $COVARIANCE step. If the user does try to utilize the .msf output, NONMEM will look for the _ETAS.msf file and warn if it can't find it. It won't warn if it can't find the _RMAT.msf or _SMAT.msf file. So I'm thinking we leave all of the files there. If we wanted to clean up anything, it would be the _RMAT.msf or _SMAT.msf file b/c I don't think we (ourselves) really use them much. But if someone did request .msf output intending to get _RMAT.msf, it might be weird that bbi nukes it anyway.

With NONMEM 7.3 and later, when MSF or MSFO option is used to specify an MSFO file in the $EST record e.g.,
 $EST ... MSFO=msfroot.msf
then in addition to the main MSF file msfroot.msf, file msfroot_ETAS.msf containing individual etas will also be produced, and provide additional information when a $MSFI record is used in a subsequent problem or control stream. This is referred to as an "extra" msf file. If the Covariance Step is also implemented, files msfroot_RMAT.msf and msfroot_SMAT.msf containing intermediate information on the R matrix and S matrix will also be produced. These files provide information when a $MSFI record along with a $COV ... RESUME record is used in a subsequent problem or control stream.

The use of an extension, e.g., .msf, is optional. If the _ETA file is not present, NONMEM issues a warning:
 WARNING: EXTRA MSF FILE COULD NOT BE OPENED: c5msf2x_ETAS

There is no warning if _SMAT and/or _RMAT are not present.
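
For reference, the naming rule in the help text above can be sketched as a small helper that derives the "extra" file names from an MSFO value. `extraMSFNames` is a hypothetical function, not something in bbi, and it assumes the suffix is inserted before whatever extension the user supplied (the help text only shows the `.msf` and no-extension cases). Note that the _RMAT and _SMAT files would only exist if the covariance step was run.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extraMSFNames returns the "extra" msf file names derived from the value
// given to MSFO= in $EST: root_ETAS, root_RMAT, and root_SMAT, each with
// the original extension (if any) appended.
func extraMSFNames(msfo string) []string {
	ext := filepath.Ext(msfo)             // "" when no extension was used
	root := strings.TrimSuffix(msfo, ext) // e.g. "msfroot"
	names := make([]string, 0, 3)
	for _, suffix := range []string{"_ETAS", "_RMAT", "_SMAT"} {
		names = append(names, root+suffix+ext)
	}
	return names
}

func main() {
	fmt.Println(extraMSFNames("msfroot.msf")) // [msfroot_ETAS.msf msfroot_RMAT.msf msfroot_SMAT.msf]
	fmt.Println(extraMSFNames("c5msf2x"))     // [c5msf2x_ETAS c5msf2x_RMAT c5msf2x_SMAT]
}
```

A helper along these lines could let any cleanup logic treat the main msf file and its companions as one group, so that _ETAS.msf is never removed while the main file is kept.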