NeuralEnsemble / python-neo

Neo is a package for representing electrophysiology data in Python, together with support for reading a wide range of neurophysiology file formats
http://neo.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
316 stars 249 forks source link

BaseTestRawIO's setUp function deletes data needed for subsequent tests #1550

Open nikhilchandra opened 1 week ago

nikhilchandra commented 1 week ago

Describe the bug I am putting together a unit test for issue #1546 (Link) but when I try to run unittest.main() from test_plexon2rawio.py, the code breaks.

To Reproduce

  1. I added a unit test called "test_check_enabled_flags" to TestPlexon2RawIO in test_plexon2rawio.py.
  2. I ran unittest.main() from test_plexon2rawio.py
  3. I think that BaseTestRawIO's setUp() function runs before each unit test. So before test_check_enabled_flags actually ran, setUp() called download_dataset() from neo.utils.dataset. On Line 69, the command "dataset=datalad.api.install(path=local_folder, source=repo)" triggered lazy loading of the data repository to my $HOME folder (files are all 1 KB large)
  4. Within download_dataset() there is a subsequent command on line 75 -- dataset.get(remote_path). With remote_path set to "plexon", this triggered the actual download of the plexon files. Here are the contents of the plexon folder after running this command: image
  5. My unit test ran fine.
  6. BaseTestRawIO has another unit test called test_read_all. Before this ran, setUp() was called again, and hence download_dataset(). This time, since ephy_testing_data had already been downloaded to my $HOME folder, and so a different branch of the if/else statement ran. On line 66, the command "repo.call_git(["checkout", "--force", "master"]) seemed to wipe out a subset of the files contained within the folder "plexon". Here are the contents after running this command: image
  7. When dataset.get(remote_path) runs for the 2nd time, nothing happens -- the files remain 1 KB.
  8. When test_read_all runs, it tries to load 4chDemoPL2.pl2, which is empty. All values contained in the file_info header are zero. Therefore, when the code tries to run compliance.count_element(reader) on Line 114 of common_rawio_test.py, it breaks on a ZeroDivisionError. Here is the error message:

`Traceback (most recent call last): File "", line 1, in File "C:\Users\nikhil\Documents\Projects\python-neo\neo\rawio\baserawio.py", line 463, in segment_t_start return self._segment_t_start(block_index, seg_index) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\nikhil\Documents\Projects\python-neo\neo\rawio\plexon2rawio\plexon2rawio.py", line 379, in _segment_t_start return self.pl2reader.pl2_file_info.m_StartRecordingTime / self.pl2reader.pl2_file_info.m_TimestampFrequency


ZeroDivisionError: float division by zero`

**Expected behaviour**
I would expect subsequent calls to setUp() to **not** erase a subset of files as is currently happening.

**Environment:**
 - OS: Windows 11
 - Python version: 3.11.9
 - Neo version: 0.14.0dev
 - NumPy version: 1.26.4

**Additional context**
Add any other context about the problem here.
zm711 commented 1 week ago

I wonder if this is a problem with Windows. One time Alessio and I were trying to share some files and he ended up only being able to share the stub with me from Windows. Could you give me the exact command you are using to try to run the test?

nikhilchandra commented 1 week ago

I literally just ran test_plexon2rawio.py in my IDE (Visual Studio Code) by pressing Ctrl+F5.

nikhilchandra commented 1 week ago

I wonder if this is related

https://handbook.datalad.org/en/latest/intro/windows.html

samuelgarcia commented 1 week ago

I think we should move this setUp to classSetUp.

zm711 commented 1 week ago

@samuelgarcia, I trust you to do that. As a pytest baby I would hate to mess it up :)

h-mayorquin commented 1 week ago

https://github.com/NeuralEnsemble/python-neo/pull/1553

zm711 commented 1 week ago

We merged the PR moving the test setUp into main, but if you run the tests on your Windows machine and this issue persists could you open a new issue @nikhilchandra. If you still have problems it is likely datalad related. I'll leave this open for a bit though.

nikhilchandra commented 4 days ago

Hi @zm711, I cloned the NeuralEnsemble master branch to my Windows desktop and was able to run the test successfully once (this was after deleting the ephys_testing_data folder from $HOME). But the second time around, it once again deleted a subset of files in the plexon folder, causing the same break as before.

nikhilchandra commented 4 days ago

I did some further digging. I think the issue is with the command "repo.call_git(["checkout", "--force", "master"]). This is a git annex repo, not a pure git repo, and so doing a forced checkout causes the previously download files to be replaced with 1 kb symbolic links. The options for a fix would be to either replace the forced checkout with something more appropriate (not sure what), or we could just run an unlock command after:

repo.call_git(["checkout", "--force", "master"])
repo.unlock(remote_path)

I tried this locally and it fixed the problem, but I hesitate to submit a pull request just yet without further inputs @h-mayorquin @samuelgarcia @zm711 @cheydrick

zm711 commented 3 days ago

Hey @nikhilchandra, I would give the PR a quick try with your solution. If it breaks our CI then we will need to work on a different solution, but I as someone who has tried this on Windows have also struggled with the stubs/symlink stuff. So I switched to do Mac for most Neo dev. But I think it is worth a try; I don't know if any of us are actually git-annex or datalad experts so giving it a test is worth a try and we can close the PR if needed.