sergiocorreia / stata-require

Enforce exact/minimum versions of community-contributed packages.
MIT License

-require- stumbles when the same program has different versions on different Stata repositories #1

Open meadover opened 2 months ago

meadover commented 2 months ago

Thanks for updating and maintaining -require-. I am attempting to use it with my students. We ran into a problem which may be more general and might be at least partially solvable.

Nick Cox has just published an updated version of his useful utility -distinct- in SJ 23-4. My students and I have been using his 2012 version, which is on SSC. I would like to type something like:

require distinct>= 1.4.3, install from("http://www.stata-journal.com/software/sj23-4")

Issues: Depending on specified options, -require- will

test_require.do.txt test_require_nmbrd.smcl.txt

(Remove the extension -txt- to open the above files in Stata.)

Of course, Nick will probably replace his existing SSC version with his updated programs eventually, so I have captured these potential issues with -require- in the SMCL file. I imagine they could occur frequently for programs whose package names differ from the program name, such as all those published by the Stata Journal.

Mead

sergiocorreia commented 2 months ago

Hi Mead,

Thanks for the very detailed issue!

What you found touches on two current limitations of require:

1) Poor support for SJ packages

In particular, see these two lines:

https://github.com/sergiocorreia/stata-require/blob/main/src/require.ado#L308 https://github.com/sergiocorreia/stata-require/blob/main/src/require.ado#L472

The problem is that require has no way of knowing that dm0042_5 and distinct are equivalent:

It shouldn't be too difficult to build a mapping between package names and their SJ versions, but this mapping would become obsolete frequently (every time an SJ package is updated). Thus, I'm not sure of the best approach here.
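To make the mapping idea concrete, here is a minimal sketch in Python (used only for illustration; require itself is written in Stata). It assumes SJ package names consist of lowercase letters, digits, and an optional `_N` revision suffix, as in dm0042_5. The lookup table and both function names are hypothetical; only the dm0042/distinct pair comes from this thread.

```python
import re

# Hypothetical lookup table: SJ package base id -> program name.
# Only the dm0042/distinct pair is taken from this thread.
SJ_BASE_TO_PROGRAM = {"dm0042": "distinct"}

# Assumed SJ naming pattern: letters, digits, optional "_<revision>".
SJ_NAME = re.compile(r"^([a-z]+\d+)(?:_(\d+))?$")


def split_sj_name(pkg):
    """Return (base, revision) for an SJ-style name, or None if it does not match."""
    m = SJ_NAME.match(pkg)
    if m is None:
        return None
    base, rev = m.group(1), m.group(2)
    return base, int(rev) if rev else 0


def program_for(pkg):
    """Resolve a package name to a program name; unknown names map to themselves."""
    parts = split_sj_name(pkg)
    if parts is None:
        return pkg
    return SJ_BASE_TO_PROGRAM.get(parts[0], pkg)
```

Keying the table on the base id rather than the full package name would mean that a revision bump (say, from dm0042_5 to a later suffix) does not invalidate the entry, although newly published SJ packages would still have to be added.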

2) Uninstall as a condition for installing

Currently, require only reads files locally, so it cannot tell whether install will be successful in providing the required version number until it installs it (which requires uninstalling existing versions).

That said, we should be able to tweak require into reading the version from the online copy of the files, and we should be able to add this into a 1.4.1 release in the short term.
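A rough sketch of that check, in Python for illustration only: read the version from the text of the online ado file and agree to replace the local copy only if that version is known to satisfy the requirement. The header format is an assumption (a `*!` line with an optional word "version" followed by dotted digits); real ado headers vary. The function names are hypothetical, and fetching the remote file is left out.

```python
import re

# Assumed header shape: a "*!" line with an optional word "version"
# followed by dotted digits. Real ado headers vary; this is illustrative.
STARBANG = re.compile(r"^\*!\s*(?:version\s+)?(\d+(?:\.\d+)*)", re.IGNORECASE)


def parse_version(ado_text):
    """Return the first version found in a '*!' header line as a tuple of ints."""
    for line in ado_text.splitlines():
        m = STARBANG.match(line.strip())
        if m:
            return tuple(int(p) for p in m.group(1).split("."))
    return None


def _pad(a, b):
    """Pad two version tuples with zeros so that 1.4 compares equal to 1.4.0."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def safe_to_replace(remote_ado_text, minimum):
    """True only if the remote copy is known to meet the minimum version."""
    remote = parse_version(remote_ado_text)
    if remote is None:
        return False  # unknown version: do not uninstall the local copy
    r, m = _pad(remote, minimum)
    return r >= m
```

Treating an unreadable version as "not safe" keeps the existing installation in place whenever the outcome of the install cannot be predicted.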

3) Beyond require

A key limitation of require is that we don't have a list of all the available version numbers. But we've made some progress on this front using Lars Vilhuber's SSC mirror, so we should be able to get a list of every version from the last ~2 years or so (which wouldn't be part of require, and will probably be just a .dta online, updated daily). If we do this, it might also provide another way of fixing problem #1.

Cheers, S

meadover commented 2 months ago

Thanks, Sergio. I take all your points. Thanks for pointing to the code exceptions you inserted for a few specific SJ published programs. I agree that you should try to prevent -require- from uninstalling a program which it cannot replace.

Another question/suggestion:

In searching through the code for version 1.4 of -require.ado-, I don't see any reference to the file -stata.trk- which Stata maintains in the root directory of PLUS.

On the other hand, in searching the code of version 1.1.8 10feb2021 of -adoupdate.ado-, I note that the Mata code frequently references the -stata.trk- file. The code includes Mata subroutines named -read_statatrk()-, -read_statatrk_element-, -read_statatrk_skiphdr- and -ffn_of_statatrk()-, which it seems to use to parse the -stata.trk- file and answer some of the same questions about installed packages that you are answering with Stata's -regexm()- and -strpos()- functions.

My grasp of Mata is too elementary for me to figure out whether you could use these subroutines to your advantage. Is there a reason that you are not using the -stata.trk- file that Stata is conveniently maintaining?
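For readers unfamiliar with the file, here is a minimal sketch of reading a stata.trk-style file, in Python for illustration only. The record layout is an assumption based on how such files typically look, not an official specification: lines tagged `S` (source URL), `N` (package file), `D` (install date), `d` (description), `f` (installed file), and a closing `e`. The spelling of the "Distribution-Date:" line is also assumed. Neither function corresponds to the Mata routines in -adoupdate-.

```python
import re


def parse_trk(text):
    """Parse stata.trk-style text into a list of package records (assumed layout)."""
    packages, current = [], None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and the "*!" header comments
        tag, _, value = line.partition(" ")
        if tag == "S":
            # A source line is assumed to open a new record.
            current = {"source": value, "name": None, "installed": None,
                       "description": [], "files": []}
        elif current is None:
            continue  # ignore anything outside a record
        elif tag == "N":
            current["name"] = value[:-4] if value.endswith(".pkg") else value
        elif tag == "D":
            current["installed"] = value
        elif tag == "d":
            current["description"].append(value)
        elif tag == "f":
            current["files"].append(value)
        elif tag == "e":
            packages.append(current)
            current = None
    return packages


def distribution_date(package):
    """Return the YYYYMMDD distribution date from the description, if present."""
    for line in package["description"]:
        m = re.search(r"Distribution-Date:\s*(\d{8})", line)
        if m:
            return m.group(1)
    return None
```

Under these assumptions, a package installed from SSC would yield a distribution date, while an SJ package without such a line would yield `None`.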

Stata's -adoupdate- seems to use different decision rules for updating community-contributed packages and SJ-published packages. From my experience, for a community-contributed package, -adoupdate- uses only the "Distribution Date" in the package file. Weirdly, the help file for "usersite", which is otherwise quite complete, does not mention the need for a "distribution date" in each community-contributed package file. The only mention of this requirement that I have found in the documentation is in the help file for -adoupdate-. From inside Stata, type "help adoupdate##developers".

For a package distributed by Stata for the SJ, -adoupdate- apparently compares only the suffix of the package name, ignoring the distribution date.
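The two rules guessed at above can be written out as a short sketch, in Python for illustration only. This encodes the conjecture in this comment, not the documented behavior of -adoupdate-. The dictionary keys, the function names, and the assumed SJ naming pattern are all hypothetical.

```python
import re

# Assumed SJ naming pattern: letters, digits, optional "_<revision>".
SJ_NAME = re.compile(r"^([a-z]+\d+)(?:_(\d+))?$")


def sj_parts(name):
    """Return (base, revision) for an SJ-style package name, else None."""
    m = SJ_NAME.match(name)
    if m is None:
        return None
    return m.group(1), int(m.group(2) or 0)


def update_available(installed, available):
    """Apply the conjectured rules to two package records.

    Each record is a dict with a 'name' and an optional
    'distribution_date' string in YYYYMMDD form.
    """
    inst_date = installed.get("distribution_date")
    avail_date = available.get("distribution_date")
    if inst_date and avail_date:
        # Community-contributed rule: compare distribution dates only.
        return avail_date > inst_date
    inst_sj, avail_sj = sj_parts(installed["name"]), sj_parts(available["name"])
    if inst_sj and avail_sj and inst_sj[0] == avail_sj[0]:
        # SJ rule: same base id, compare the numeric suffix only.
        return avail_sj[1] > inst_sj[1]
    return False  # the two records are not comparable under either rule
```

Note that an SSC copy of -distinct- and the SJ package dm0042_5 fall through to the last line: neither rule can rank them against each other, which is the situation described in this issue.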

Supporting my guess about -adoupdate-'s algorithm are the runs of -net describe- and -ado describe- in the numbered SMCL file I sent you. The command -ado describe distinct- on line 121 reports the contents of my -stata.trk- file for Nick's earlier version of -distinct-, which I installed from SSC. On line 148 you can see the "Distribution Date" that he or someone at Boston College added to his package file.

In contrast, check out the contents of my -stata.trk- file for the SJ-published package -dm0042_5-, beginning on line 266. Note that there is no distribution date. My -stata.trk- entry for this package is the same as the online version of the package, which can be seen on lines 52-77 or by typing:

      net describe dm0042_5 , from("http://www.stata-journal.com/software/sj23-4")

Clearly the purpose of -require- is broader than that of Stata's -adoupdate-. However, it seems that when a user's -require- command uses the ">=" operator instead of the "==" operator for a community-contributed package, their purposes largely coincide.
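The difference between the two operators can be shown with a small sketch, in Python for illustration only. The requirement syntax is modelled on the command quoted earlier in this thread (`distinct>= 1.4.3`). The parser and function names are hypothetical and do not reflect how -require- is implemented.

```python
import re

# Matches "<name> <op> <dotted version>", with optional spaces around the operator.
REQ = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(>=|==)\s*(\d+(?:\.\d+)*)\s*$")


def parse_requirement(spec):
    """Split a requirement such as 'distinct>= 1.4.3' into (name, op, version)."""
    m = REQ.match(spec)
    if m is None:
        raise ValueError("unrecognized requirement: %r" % spec)
    name, op, ver = m.groups()
    return name, op, tuple(int(p) for p in ver.split("."))


def satisfied(installed, op, required):
    """Check an installed version tuple against the requirement."""
    n = max(len(installed), len(required))
    a = installed + (0,) * (n - len(installed))  # pad so 1.5 equals 1.5.0
    b = required + (0,) * (n - len(required))
    return a >= b if op == ">=" else a == b
```

With ">=", any sufficiently new installed version passes, which is the same question an updater asks. With "==", a newer version fails the check, so simply updating to the latest release is not enough.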

Thanks again for your work to improve replicability with the -require- command. The program also helps with teaching, I find.

Mead

