sot / ska_helpers

BSD 3-Clause "New" or "Revised" License
0 stars 1 forks source link

Checking if the git repo is dirty makes the SVN repo dirty on FOT Windows setup #44

Closed jskrist closed 1 year ago

jskrist commented 1 year ago

https://github.com/sot/ska_helpers/blob/7cc57f01f6de934baa535e16207c666ad4a957f2/ska_helpers/chandra_models.py#L311C8-L311C23

When this line runs, the index file in the .git directory is updated in some way, which causes SVN to report that this file is modified. This is not causing any errors in the MATLAB tools, but users are notified every time they open the FOT MATLAB Tools that their chandra_models directory contains modifications.

This seems to be called when chandra_aca.drift.load_drift_pars() is run. I'm not sure yet if the way that the index file is modified is consistent over time and across users. If it is, perhaps @matthewdahmer could make sure the aca model is always run before he commits the thermal models to the SVN repository to resolve this issue.

taldcroft commented 1 year ago

This is the first time this functionality has been really used in the MATLAB environment. Windows continues to surprise me. There is absolutely no reason that the index should be touched when just status'ing a repo.

I confirmed this does not happen on Mac and on my Windows VM. My testing:

conda activate ska3
cd ~/git/chandra_models
md5 (or md5sum) .git/index
  # some hash string
ipython
>>> import git
>>> repo = git.Repo(".")
>>> repo.is_dirty()
False
>>> exit
md5 (or md5sum) .git/index
  # The SAME hash string 
# Also the same modification date, size, etc as before.
taldcroft commented 1 year ago

The only other thing I can think of is a step before committing to SVN where you make all the files in .git be read-only. I don't know if this will create other problems.

taldcroft commented 1 year ago

An initial test of this chmod -R -w .git in git bash on Windows in that repo seems to give the expected result. I can get status but other index-changing commands fail.

jskrist commented 1 year ago

I have tried a variation of your steps outside of MATLAB to check if pyexec/MATLAB was doing something weird, and I still see the index file being modified (in a windows command prompt):

> cd Documents\MATLAB\FOT_Tools\Python\python3_Windows_64bit_trunk
> condabin\conda.bat activate base
> cd ..\..\Thermal_Models\chandra_models-3.49
> certutil -hashfile .git\index MD5
  # some MD5 string
> python
>> import git
>> repo = git.Repo(".")
>> repo.is_dirty()
>> exit()
> certutil -hashfile .git\index MD5
  # some different MD5 string

I will test out making the files read only.

taldcroft commented 1 year ago

One of the weird problems we've had with Windows and conda was (at some point) getting a git executable within conda that was not quite right. See https://github.com/sot/skare3/issues/1082. I wonder if this is related.

taldcroft commented 1 year ago

Well maybe we need to just avoid all git manipulations on Windows. Given the current use patterns this will probably be fine.

jeanconn commented 1 year ago

Should SVN just ignore the .git/index anyway?

jeanconn commented 1 year ago

Or would that mean SVN wouldn't be able to bring over the history for us to see with git? I get a little confused about our SVN git process here.

jskrist commented 1 year ago

It looks like:

jeanconn commented 1 year ago

I cannot use many (maybe any) of the commands in Library\usr\bin\ directory of the python environment, which includes commands like chmod, which, whoami. all of these commands throw an error related to issues with locale_ctype_ptr missing from a dll (which is sometimes reported as msys_intl_8.dll).

This was happening for me too. I thought those were installed from something else (not conda) in the Windows environment, but it sounds like we need to figure out where they come from and if they are incompatible or don't have access to a library they need. I also somewhat wondered if this was related to the 'missing .libs because of SVN' issue that happened for our last flight promotion.

jskrist commented 1 year ago

@jeanconn , My understanding is that the SVN repository contains the entirety of the git repository, which includes the .git directory. The process we use, as I understand it, is that Matt maintains his git repository, and when he's ready to promote a new tag, he makes a copy of the entire folder into the SVN directory, and commits everything to SVN.

jskrist commented 1 year ago

my suspicion is that these executables and dlls come from the python package msys2-conda-epoch, which is 7 years old, and may not be compatible any longer.

taldcroft commented 1 year ago

@jskrist - I have an idea to stop this from happening (with a bit of a hammer). I can get something out today for a new rc ska3-matlab release. Should we go down this path or accept the SVN warnings?

jskrist commented 1 year ago

@taldcroft, I think we can accept the SVN warnings for now. The risk is that if a user's local thermal models directory contains real, impacting changes (as opposed to this benign index file difference), they may not realize it, as they would be used to ignoring this warning.

Realistically, these files are not being significantly modified by anyone "accidentally", so I don't foresee this as being an urgent issue. Also, if any model spec files are locally modified, then the checksum reported in MCC reports that:

image

javierggt commented 1 year ago

So... you do not want us to patch that? Tom was suggesting just skipping this is_dirty check whenever THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW is defined, and that would be a simple change, I think.

jeanconn commented 1 year ago

We are going to figure out a fix one way or the other, the question was really about if James and the FOT need a fix in ska3-matlab 2023.5 for 2023_030 or if it can wait until 2023_040. I heard from his answer that this can wait until 2023_040.

jskrist commented 1 year ago

That does seem like a fairly straight forward fix. I just don't want to rush into a "quick-fix", as I don't think this issue is urgent. I'm OK waiting until 2023_040 for a fix.

taldcroft commented 1 year ago

Sounds good. I just looked into making the fix and quickly ran into a question I can't answer: does simply creating a git.Repo() object cause this index dirtying? Probably not, but I wouldn't be confident without testing.

jskrist commented 1 year ago

@taldcroft , creating the object, does not dirty the SVN repo. The file only gets modified when the is_dirty() method is called.

jeanconn commented 1 year ago

It looks to me like "git status" touches index too, but rev-parse does not, so this doesn't look to be a problem for the new code in #42.

jeanconn commented 1 year ago

It also looks like "git status" touches index, but "git --no-optional-locks status" does not, so perhaps there's an option to also disable the index-touching pieces that we don't care about as well. Not doing is_dirty() does seem like it would also be fine for the windows THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW case as well, as if the repo is dirty on the matlab side we should be getting the svn protections... I just was curious about what git is doing with the index.