dolthub / dolt

Dolt – Git for Data
Apache License 2.0
17.74k stars 504 forks source link

write_file and write_pandas raise DoltException on Windows #1774

Closed PaulM5406 closed 3 years ago

PaulM5406 commented 3 years ago

Hello,

write_file and write_pandas raise DoltException on Windows. Here is the traceback. I do not know which is this other process that locks the file. If I run the command table import data_pisigma -c --pk=id temp.csv directly from the terminal, I get no error.

b'An error occurred moving data:\n\ncause: error garbage collecting the following files:\t.\\.dolt\\noms/4d3m60adthij431437g9v7l8uf4kigpg: remove .\\.dolt\\noms/4d3m60adthij431437g9v7l8uf4kigpg: The process cannot access the file because it is being used by another process.\n' Traceback (most recent call last): File "C:\Users\paolo\Dev\data_pisigma\main.py", line 67, in <module> write_file(dolt, File "C:\ProgramData\Anaconda3\envs\data_pisigma\lib\site-packages\doltcli\utils.py", line 73, in write_file _import_helper( File "C:\ProgramData\Anaconda3\envs\data_pisigma\lib\site-packages\doltcli\utils.py", line 199, in _import_helper dolt.execute(args + [fname]) File "C:\ProgramData\Anaconda3\envs\data_pisigma\lib\site-packages\doltcli\dolt.py", line 309, in execute output = _execute(args, self.repo_dir, outfile=outfile) File "C:\ProgramData\Anaconda3\envs\data_pisigma\lib\site-packages\doltcli\dolt.py", line 89, in _execute raise DoltException(str_args, out, err, exitcode) doltcli.dolt.DoltException: ('table import data_pisigma -c --pk=id C:\\Users\\paolo\\AppData\\Local\\Temp\\tmpqcm7hpaz.csv', b'Rows Processed: 55, Additions: 55, Modifications: 0, Had No Effect: 0', b'An error occurred moving data:\n\ncause: error garbage collecting the following files:\t.\\.dolt\\noms/4d3m60adth ij431437g9v7l8uf4kigpg: remove .\\.dolt\\noms/4d3m60adthij431437g9v7l8uf4kigpg: The process cannot access the file because it is being used by another process.\n', 1)

oscarbatori commented 3 years ago

We are taking a look at this, thanks for the bug report.

oscarbatori commented 3 years ago

Can we get a repro of this, like the contents of temp.csv?

EricTurner3 commented 3 years ago

I am having the same issue on Windows. The issue keeps stating it is coming from commit_date so I tried to specify it manually and it still errors out. Dolt version is 0.26.6, doltpy is 2.0.6, Python is 3.7.9.

main.py

from doltpy.cli import Dolt
from doltpy.cli.write import write_file
from datetime import datetime

dolt = Dolt.init('./tennis-players')
write_file(dolt,
           'great_players',
           open('./great_players.csv'),
           import_mode='force_create',
           primary_key=['id'],
           commit=True,
           commit_message='test commit',
           commit_date=datetime.now())

great_players.csv:

id, name
1, Eric
2, Bob
3, Frank

Trying to run the program:

 C:\Users\Eric\Documents\doltpy_write_test> python main.py                                                                                 b'An error occurred moving data:\n\ncause: error garbage collecting the following files:\t.\\.dolt\\noms/4bujnjuatdli0klda7ucot0tl836jujt: remove .\\.dolt\\noms/4bujnjuatdli0klda7ucot0tl836jujt: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/b4jhr1nlmr9fin1a6fse4fqfilk2o1pd: remove .\\.dolt\\noms/b4jhr1nlmr9fin1a6fse4fqfilk2o1pd: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/dr6ha44vervspsgb5psvtgprccinsca5: remove .\\.dolt\\noms/dr6ha44vervspsgb5psvtgprccinsca5: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/t49qbt77gnvhn6mt2fvh1vgbmdv9h5t0: remove .\\.dolt\\noms/t49qbt77gnvhn6mt2fvh1vgbmdv9h5t0: The process cannot access the file because it is being used by another process.\n'
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    commit_date=datetime.now())
  File "C:\Users\Eric\AppData\Local\Programs\Python\Python37\lib\site-packages\doltcli\utils.py", line 81, in write_file
    commit_date=commit_date,
  File "C:\Users\Eric\AppData\Local\Programs\Python\Python37\lib\site-packages\doltcli\utils.py", line 199, in _import_helper
    dolt.execute(args + [fname])
  File "C:\Users\Eric\AppData\Local\Programs\Python\Python37\lib\site-packages\doltcli\dolt.py", line 309, in execute
    output = _execute(args, self.repo_dir, outfile=outfile)
  File "C:\Users\Eric\AppData\Local\Programs\Python\Python37\lib\site-packages\doltcli\dolt.py", line 89, in _execute
    raise DoltException(str_args, out, err, exitcode)
doltcli.dolt.DoltException: ('table import great_players -c --pk=id C:\\Users\\Eric\\AppData\\Local\\Temp\\tmp37g_arpf.csv', b'Rows Processed: 3, Additions: 3, Modifications: 0, Had No Effect: 0', b'An error occurred moving data:\n\ncause: error garbage collecting the following files:\t.\\.dolt\\noms/4bujnjuatdli0klda7ucot0tl836jujt: remove .\\.dolt\\noms/4bujnjuatdli0klda7ucot0tl836jujt: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/b4jhr1nlmr9fin1a6fse4fqfilk2o1pd: remove .\\.dolt\\noms/b4jhr1nlmr9fin1a6fse4fqfilk2o1pd: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/dr6ha44vervspsgb5psvtgprccinsca5: remove .\\.dolt\\noms/dr6ha44vervspsgb5psvtgprccinsca5: The process cannot access the file because it is being used by another process.\t.\\.dolt\\noms/t49qbt77gnvhn6mt2fvh1vgbmdv9h5t0: remove .\\.dolt\\noms/t49qbt77gnvhn6mt2fvh1vgbmdv9h5t0: The process cannot access the file because it is being used by another process.\n', 1)
ghost commented 3 years ago

The reproduction code that @EricTurner3 just posted about is from https://docs.dolthub.com/interfaces/python

steel3d commented 3 years ago

Same issue with read_pandas:

df = read_pandas(dolt, 'IPv4ToCountry')

---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
<ipython-input-5-4a020434a859> in <module>()
      1 from doltpy.cli.read import read_pandas
----> 2 df = read_pandas(dolt, 'IPv4ToCountry')

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltpy\cli\read.py in read_pandas(dolt, table, as_of)
     26 
     27 def read_pandas(dolt: Dolt, table: str, as_of: str = None) -> pd.DataFrame:
---> 28     return read_pandas_sql(dolt, get_read_table_asof_query(table, as_of))

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltpy\cli\read.py in read_pandas_sql(dolt, sql)
     22 
     23 def read_pandas_sql(dolt: Dolt, sql: str) -> pd.DataFrame:
---> 24     return read_table_sql(dolt, sql, result_parser=parse_to_pandas)
     25 
     26 

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltcli\utils.py in read_table_sql(dolt, sql, result_parser)
     51     dolt: DoltT, sql: str, result_parser: Optional[Callable[[str], Any]] = None
     52 ) -> List[dict]:
---> 53     return dolt.sql(sql, result_format="csv", result_parser=result_parser)
     54 
     55 

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltcli\dolt.py in sql(self, query, result_format, execute, save, message, list_saved, batch, multi_db_dir, result_file, result_parser, **kwargs)
    583             with tempfile.NamedTemporaryFile() as f:
    584                 args.extend(["--result-format", "csv"])
--> 585                 output_file = self.execute(args, stdout_to_file=f.name, **kwargs)
    586                 if not hasattr(result_parser, "__call__"):
    587                     raise ValueError(

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltcli\dolt.py in execute(self, args, print_output, stdout_to_file, error)
    330                 output = repr(e)
    331         else:
--> 332             output = _execute(args, self.repo_dir, outfile=stdout_to_file)
    333 
    334         print_output = print_output or self._print_output

c:\users\steve\appdata\local\programs\python\python36\lib\site-packages\doltcli\dolt.py in _execute(args, cwd, outfile)
     79     logger.info(str_args)
     80     if outfile:
---> 81         with open(outfile, "w") as f:
     82             # _outfile = open(outfile, "w") if outfile else PIPE
     83             proc = Popen(args=_args, cwd=cwd, stdout=f, stderr=PIPE)

PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Steve\\AppData\\Local\\Temp\\tmp7x1q3v_w'
ghost commented 3 years ago

This appears to be the case for Windows 10 Home Edition x64 English - May 2021 Update with v0.26.6 msi and Python 3.9.5 installer x64 bit.

I'm sure other versions have the same issues too, but just incase, I put the information for a specific version of software that is guaranteed to make this bug reproducible.

Screen Shot 2021-05-31 at 10 37 11 AM

ghost commented 3 years ago

I should mention that the script does in fact insert the data and recreate the table (even in update mode) even despite the garbage collection error messages. Running dolt gc from command prompt directly does not output any text at all (so I presume it works correctly).

Edit: For the sake of making debugging easier, and now that I know that the bug is reproducible with Windows 10 in a VM, I'm going to install Virtual Box Guest Additions and some tools to help me monitor what program is accessing what files (and potentially go through and set breakpoints on doltpy to see when the error message is activated).

Screen Shot 2021-05-31 at 10 42 04 AM

ghost commented 3 years ago

Not entirely sure if the buffer overflow is related to this particular bug, but it is triggered by write_file. The Python process was captured under Python.PML and the write_file action (the CLI side) was captured under Dolt_writefile.PML in the zip Debug.zip.

I used Microsoft's official process monitor to capture the data. https://docs.microsoft.com/en-us/sysinternals/downloads/procmon

Debug.zip

Screen Shot 2021-05-31 at 11 42 24 AM

ghost commented 3 years ago

As I didn't think to actually save the error message for the last zip, this is a "patch" zip that contains a new capture and a text document of the error message output by the script from the original zip. I did not include the script, repo, csv, or anything like that in this second zip.

With-Error-Message.zip

I also suspect that the buffer overflow may actually be related to the error message as one of the noms affected (and the only one I checked so far) is affected by the buffer overflow.

Screen Shot 2021-05-31 at 11 54 40 AM

Edit 2: These are all of the operations done to nom file midf4u9uho210ll0rft1rgmfirnc4lkl which is located at C:\Users\Alexis\Desktop\great-players-example\.dolt\noms\midf4u9uho210ll0rft1rgmfirnc4lkl and has 382 bytes.

Screen Shot 2021-05-31 at 12 02 07 PM

Edit 3: It may be the sharing access violation that triggers the bug itself as seen by the last screenshot (I believe it to be a symptom of the problem as why would specific noms have a sharing access violation?). I'm currently archiving my whole VM in a highly compressed 7zip archive for storage, so after that finishes, I'll be able to resume the VM and see what the full error message says.

Edit 4:

This may be the cause why this bug exists. https://groups.google.com/g/golang-nuts/c/8BiP_mPoCd4

The new screenshots are not as useful, but they should help show that GoLang is trying to request delete file and read attributes access when it gets a sharing violation exception. The link above shows a simple GoLang example that would cause the sharing violation to occur on Windows, but not cause problems on any other system.

Screen Shot 2021-05-31 at 2 28 51 PM Screen Shot 2021-05-31 at 2 29 18 PM

ghost commented 3 years ago

I decided to compile dolt cli using the Mac OSX version of Go Lang from https://golang.org/dl/ (which was named go1.16.4.darwin-amd64.pkg and had version go version go1.16.4 darwin/amd64). I also used this commit of Dolt CLI (the one for the latest release of v0.26.6 https://github.com/dolthub/dolt/commit/1eb286f34abc0f861aa855270a2aac4645f2db23) and I then I ran the command GOOS="windows" go build ./cmd/dolt in order to build the Windows executable on my Mac. It worked flawlessly on my Windows 10 installation whereas the prebuilt version produced the bug that is the topic of this issue.

I believe you may need to upgrade to the latest version of Go in order to compile the Windows version of Dolt CLI as the first bug report I saw related to the sharing violation exception was posted in December of 2020 (so relatively recently). So, the fix would be in a very, very new version of GoLang and therefore is not likely to exist in the CI build unless the CI build pulls the absolutely latest copy of GoLang before each build.

Edit: I did more testing to put my mind at rest to make sure this wasn't a fluke. It was. I captured it in process monitor again and the sharing violations are still there. So, to fix this issue, you'll most likely have to implement a Windows specific method of opening the temporary files so you can tell Windows that these files are deletable in order to fix the sharing violation exception. ☹️

Hydrocharged commented 3 years ago

We've just released a new version of Dolt that we think may fix this issue. It's a bit hard to test locally due to its non-deterministic nature. Give it a spin and let us know if you still run into the problem!

zachmu commented 3 years ago

Going to close this out for the time being, because we have a solution for windows that we think is as good as we can do for now.

Basically: os.Rename() on Windows is error prone due to the OS preventing access to files sometimes, for reasons not even the golang authors understand. The best we can do for now is retry failed Rename operations, which we are now doing.

If anyone can demonstrate a reliable repro for this, or runs into this again in the wild, please do let us know.