marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io
Other
1.22k stars 228 forks source link

Temp file creation seems to be broken in the Windows build #334

Open TommiNieminen opened 4 years ago

TommiNieminen commented 4 years ago

I compiled the latest Windows build and tried to start training, but I get the "Return value is not equal to streambuf pointer, that is weird" error from the MakeTemp method in file_stream.cpp.

I don't know much C++, but it seems the problem is that the temp file is first created with the open command from Windows C Runtime, and the filebuf.open is used to open the file before the file descriptor from the CRT's open command is closed. Moving the line 156

_ABORTIF(close(fd), "Can't close file descriptor", name);

before the filebuf.open on line 149 makes the error go away.

After this you get another error, though, this time from InputFileStream: "File cannot be opened". The reason for this is that the file_ property is not set in the MSC_VER part of MakeTemp (file_ is only set in the non-MSC_VER part of the code on line 144). file_ is used to pass the temp file name to the InputFileStream, so the temp file can't be loaded. This can be fixed by adding the following line to MakeTemp:

file_ = std::string(name);

After this, one error still occurs. TemporaryFile has a default value of true for the earlyUnlink property. This means that the remove function is called for the temp file in TemporaryFile constructor on line 96. The problem is that the file is still in use at that point, so remove will fail in Windows (apparently in POSIX systems this won't cause an error). After changing the default of earlyUnlink property to false the training will start without errors. Not sure if the temp files are removed properly afterwards.

Also, the tempdir parameter does nothing in the Windows build at the moment, since the tempnam function that's used in MakeTemp will use the value of the %TMP% env variable as the tempdir, unless %TMP% is empty. So to make tempdir work, you need to add _putenv("TMP=");, e.g. after line 112:

_#ifdef _MSCVER if(base.substr(0, 4) == "/tmp") base = getenv("TMP"); else _putenv("TMP=");

TinoDidriksen commented 1 year ago

I also just hit this issue. file_ is now correctly set in master, but I still needed to fix order of close vs. open and removing the temporary file.

This also applies to marian-dev.