alliedmodders / sourcemod

SourceMod - Source Engine Scripting and Administration
http://www.sourcemod.net/
974 stars 422 forks source link

Engine throws ERROR_HANDLE_EOF when use_valve_fs turned on #1567

Closed dragokas closed 3 years ago

dragokas commented 3 years ago

Help us help you

Environment

Description

Following code causing server console to throw error 38:

Problematic Code (or Steps to Reproduce)

#pragma semicolon 1
#pragma newdecls required

#include <sourcemod>

public void OnPluginStart()
{
    File hr = OpenFile("resource/valve_english.txt", "rb", true);
    if( hr )
    {
        int bytesRead, buff[1024];

        while( !hr.EndOfFile() )
        {
            bytesRead = hr.Read(buff, sizeof(buff), 1);

            PrintToServer("bytes read: %i", bytesRead);
        }
        delete hr;
    }
}

Logs

...
bytes read: 1024
bytes read: 1024
bytes read: 1024
Unknown read error 38
bytes read: 772

If you set "use_valve_fs" argument to false, an error is not appearing. I'll clarify the file is still located locally, not in a container.

Error is specific to CS:GO. For L4D1 & L4D2 everything works normally.

Wend4r commented 3 years ago

This file has an incompatible UTF-16 LE encoding. I recommend writing you an extension to read and decode this file in UTF-8 and already get the values by the key of the phrase by your natives, or use File.ReadInt32 and decode directly into SourcePawn, which my practice has shown is very slow in terms of the speed of the code.

dragokas commented 3 years ago

I know. But, the issue here is not about re-encoder.

offtopic: Anyway, I already wrote one in SP/SourceMod language and I have not seen any tangible loss in speed. The only significant loss in speed has happened for some reason in hr.Read() function itself, which would be a separate issue from me when I'll have time to make and present detailed proof. As about File.ReadInt32, I'm already reading data as Int16 per 512 blocks (no reasons for 4 bytes per operation):

int buff[512];
...
bytesRead = hr.Read(buff, sizeof(buff), 2);

That works nicely for SSD, but not for HDD. Anyway, that will be a separate issue about the case. Thanks for suggesting to create an extension, but it would be a waste in my case, because my use case is designing a general .inc file (without dependencies) to use by other devs.

Wend4r commented 3 years ago

As about File.ReadInt32, I'm already reading data as Int16 per 512 blocks (no reasons for 4 bytes per operation)

Yes, to decode UTF-16, it is enough to read 2 bytes each, but I thought it would be faster not to constantly turn to the drive and I have a breakdown of 2 bytes for decode (although the general situation with a slow SourcePawn did not solve this).

Wend4r commented 3 years ago

I have a SATA 3 SSD and parsing all resource/csgo_*.txt files through SourcePawn takes me about 5 seconds (the parse speed significantly depends on the efficiency of the single core, in my test, is used the Core i7 7700k processor)

Parse file "resource/csgo_english.txt" (5421686 bytes) in 0.093750 seconds
Parse file "resource/csgo_german.txt" (11028662 bytes) in 0.187500 seconds
Parse file "resource/csgo_french.txt" (10910602 bytes) in 0.171875 seconds
Parse file "resource/csgo_italian.txt" (10983180 bytes) in 0.187500 seconds
Parse file "resource/csgo_koreana.txt" (5287312 bytes) in 0.093750 seconds
Parse file "resource/csgo_spanish.txt" (10984290 bytes) in 0.187500 seconds
Parse file "resource/csgo_schinese.txt" (8963906 bytes) in 0.156250 seconds
Parse file "resource/csgo_tchinese.txt" (7860886 bytes) in 0.140625 seconds
Parse file "resource/csgo_russian.txt" (10561830 bytes) in 0.187500 seconds
Parse file "resource/csgo_thai.txt" (10379882 bytes) in 0.187500 seconds
Parse file "resource/csgo_japanese.txt" (4654104 bytes) in 0.078125 seconds
Parse file "resource/csgo_portuguese.txt" (10902344 bytes) in 0.171875 seconds
Parse file "resource/csgo_polish.txt" (11222202 bytes) in 0.187500 seconds
Parse file "resource/csgo_danish.txt" (10160338 bytes) in 0.171875 seconds
Parse file "resource/csgo_dutch.txt" (10016106 bytes) in 0.171875 seconds
Parse file "resource/csgo_finnish.txt" (10683162 bytes) in 0.171875 seconds
Parse file "resource/csgo_norwegian.txt" (10272296 bytes) in 0.171875 seconds
Parse file "resource/csgo_swedish.txt" (10528326 bytes) in 0.171875 seconds
Parse file "resource/csgo_romanian.txt" (10300450 bytes) in 0.171875 seconds
Parse file "resource/csgo_turkish.txt" (10747150 bytes) in 0.187500 seconds
Parse file "resource/csgo_hungarian.txt" (10681084 bytes) in 0.187500 seconds
Parse file "resource/csgo_czech.txt" (10662508 bytes) in 0.171875 seconds
Parse file "resource/csgo_brazilian.txt" (10768708 bytes) in 0.187500 seconds
Parse file "resource/csgo_bulgarian.txt" (11396774 bytes) in 0.187500 seconds
Parse file "resource/csgo_greek.txt" (11011342 bytes) in 0.187500 seconds
Parse file "resource/csgo_ukrainian.txt" (10693908 bytes) in 0.187500 seconds
Parse file "resource/csgo_latam.txt" (10986412 bytes) in 0.187500 seconds
Parse file "resource/csgo_vietnamese.txt" (10255662 bytes) in 0.171875 seconds
L 08/27/2021 - 17:29:43: [item_view.smx] Successfully parse all translations files in 4.718750 seconds.

Based on this, I recommend that you write an extension so that it takes at best milliseconds for all files

dvander commented 3 years ago

5s to parse ~200MB of text? And assumedly only on startup? Seems pretty good for something that hasn't been profiled or optimized in any way. Please don't write extensions on this basis alone - every OS has some kind of profiling tool.

Wend4r commented 3 years ago

@dvander,

My code ```sourcepawn void VeryBIGTranslationParse() // Entry function. { decl bool bToggleInt16; int i = 0; decl int iInt32, iInt16; float flStartTime = GetEngineTime(), flStartFileTime = flStartTime; decl char sFilePath[PLATFORM_MAX_PATH], sBuffer[10000]; decl File hFile; s_iTranslation = 0; while(s_iTranslation != MAX_LANGUAGES) { if(s_hLanguageTextMap[s_iTranslation]) { s_hLanguageTextMap[s_iTranslation].Clear(); } else { s_hLanguageTextMap[s_iTranslation] = new StringMap(); } FormatEx(sFilePath, sizeof(sFilePath), "resource/csgo_%s.txt", s_sLanguegeFiles[s_iTranslation]); // Parse UTF-16 symbols for translate to UTF-8. if((hFile = OpenFile(sFilePath, "rb"))) { // Skip BOM title. hFile.Seek(2, SEEK_SET); // while(!hFile.EndOfFile()) // Work only on UTF-8 // { // if(hFile.ReadLine(sBuffer, sizeof(sBuffer))) // { // PrintToServer("\"%s\"", sBuffer); // } // } bToggleInt16 = true; while(hFile.ReadInt32(iInt32) /* Function File.Read not normal working on Windows! */ ) { do { iInt16 = bToggleInt16 ? (iInt32 & 0xFFFF) : (iInt32 >>> 16); // Сheck whether there is a surrogate couple // if(iInt16 < 0) // { // decl int i2Int16; // hFile.ReadUint16(i2Int16); // iInt16 = 0x10000 + (i2Int16 & 0x03FF) << 10; // } if(iInt16 <= 0x7F) { // if(!s_iTranslation) // { // PrintToServer("1: %u - \"%s\"", iInt16, iInt16); // } if((sBuffer[i] = iInt16) == '\n') { sBuffer[i] = '\0'; PushTranslate(sBuffer, i); i = 0; } else { i++; } } else if(iInt16 <= 0x7FF) { sBuffer[i++] = 0xC0 | iInt16 >> 6 & 0x1F; sBuffer[i++] = 0x80 | iInt16 & 0x3F; // PrintToServer("2: %u - \"%s\"", sBuffer[i - 2], sBuffer[i - 2]); } else if(iInt16 <= 0xFFFF) { sBuffer[i++] = 0xE0 | iInt16 >> 12 & 0xF; sBuffer[i++] = 0x80 | iInt16 >> 6 & 0x3F; sBuffer[i++] = 0x80 | iInt16 & 0x3F; // PrintToServer("3: %u - \"%s\"", sBuffer[i - 3], sBuffer[i - 3]); } else // For 32 bit. { sBuffer[i++] = 0xF0 | iInt16 >> 18 & 0x7; sBuffer[i++] = 0x80 | iInt16 >> 12 & 0x3F; sBuffer[i++] = 0x80 | iInt16 >> 6 & 0x3F; sBuffer[i++] = 0x80 | iInt16 & 0x3F; // PrintToServer("4: %u - \"%s\"", sBuffer[i - 4], sBuffer[i - 4]); } if(i + 5 > sizeof(sBuffer)) { sBuffer[i] = '\0'; PushTranslate(sBuffer, i); i = 0; } } while(!(bToggleInt16 ^= true)); } // PrintToServer("LastSymbol - 0x%X (\"%s\")", iInt16, iInt16); } float flBufferTime = GetEngineTime(); PrintToServer("Parse file \"%s\" (%i bytes) in %f seconds", sFilePath, FileSize(sFilePath), flBufferTime - flStartFileTime); flStartFileTime = flBufferTime; s_iTranslation++; } LogMessage("Successfully parse all translations files in %f seconds.", GetEngineTime() - flStartTime); } static void PushTranslate(char[] sBuffer, const int &iSize) { // if(!s_iTranslation) // { // PrintToServer("%s (%i)", sBuffer, iSize); // } bool bIsSlash = false; int i = 0, iQuoteCount = 0, iQuoteStart = 0, iQuoteEnd = 0, iQuote2Start = 0; while(i != iSize) { if(sBuffer[i] == '/') { if(bIsSlash) { break; } bIsSlash = true; } else if(sBuffer[i] == '"') { if(!iQuoteCount) { iQuoteStart = i + 1; } else if(iQuoteCount == 1) { iQuoteEnd = i; // if(!s_iTranslation) // csgo_english // { // // PrintToServer("s_hLanguageNames -> sName = \"%s\"", sName); // s_hLanguageNames.PushString(sName); // } // else // { // // Very slow // // int iIndex = s_hLanguageNames.FindString(sName); // // if(iIndex != -1) // // { // // // s_hLanguageTextsIndexes[s_iTranslation].Push(iIndex); // // } // // else // // { // // return; // // } // } } else { if(iQuoteCount == 2) { iQuote2Start = i + 1; } else if(iQuoteCount == 3) { break; } } iQuoteCount++; } i++; } int iNameLength = iQuoteEnd - iQuoteStart; if(iNameLength > 0) { char[] sName = new char[++iNameLength]; strcopy(sName, iNameLength, sBuffer[iQuoteStart]); // if(!s_iTranslation) // { // PrintToServer("%s = %s", sName, sBuffer[iQuote2Start]); // } if(strncmp(sName, "[english]", 9) && i - iQuote2Start > 0) { sBuffer[i] = '\0'; s_hLanguageTextMap[s_iTranslation].SetString(sName, sBuffer[iQuote2Start]); } } } ```
Wend4r commented 3 years ago

Please don't write extensions on this basis alone - every OS has some kind of profiling tool.

I tested on Linux

dvander commented 3 years ago

For realtime measurements, you want to use CreateProfiler in profiler.inc. But even better, would be to use Linux's builtin tool "perf". Eg:

perf record -a --call-graph fp [command-line] perf report

There are also GUI tools to help with this, https://www.kdab.com/hotspot-gui-linux-perf-profiler/ is a good one. I believe there is even a way to get scripted functions to show up, but I'm not sure if that code is in the 1.11 branch yet.

asherkin commented 3 years ago

I believe there is even a way to get scripted functions to show up, but I'm not sure if that code is in the 1.11 branch yet.

Yep! Basic support is enabled by default for both SM and spshell.

Wend4r commented 3 years ago

For realtime measurements, you want to use CreateProfiler in profiler.inc. But even better, would be to use Linux's builtin tool "perf". Eg:

perf record -a --call-graph fp perf report

I knew about the profiler and trusted the processor time of GetEngineTime() more, but I didn't know that can use perf to show graphs, thank you.

dvander commented 3 years ago

CreateProfiler guarantees a realtime clock, eg QueryPerformanceCounter on windows and gettimeofday on Linux (though we should probably switch it to clock_monotonic).

Wend4r commented 3 years ago

I asked a friend who also dealt with this issue to send his tests with the extension on Xeon X5680 in the single core with adding strings to the hash table. It’s results:

Parse (csgo_brazilian.txt) ->68 milliseconds
Parse (csgo_bulgarian.txt) ->82 milliseconds
Parse (csgo_czech.txt) ->70 milliseconds
Parse (csgo_danish.txt) ->63 milliseconds
Parse (csgo_dutch.txt) ->62 milliseconds
Parse (csgo_english.txt) ->48 milliseconds
Parse (csgo_finnish.txt) ->67 milliseconds
Parse (csgo_french.txt) ->68 milliseconds
Parse (csgo_german.txt) ->68 milliseconds
Parse (csgo_greek.txt) ->76 milliseconds
Parse (csgo_hungarian.txt) ->67 milliseconds
Parse (csgo_italian.txt) ->68 milliseconds
Parse (csgo_korean.txt) ->33 milliseconds
Parse (csgo_koreana.txt) ->33 milliseconds
Parse (csgo_latam.txt) ->68 milliseconds
Parse (csgo_norwegian.txt) ->64 milliseconds
Parse (csgo_polish.txt) ->71 milliseconds
Parse (csgo_portuguese.txt) ->68 milliseconds
Parse (csgo_romanian.txt) ->68 milliseconds
Parse (csgo_russian.txt) ->70 milliseconds
Parse (csgo_schinese.txt) ->54 milliseconds
Parse (csgo_spanish.txt) ->69 milliseconds
Parse (csgo_swedish.txt) ->66 milliseconds
Parse (csgo_tchinese.txt) ->47 milliseconds
Parse (csgo_thai.txt) ->70 milliseconds
Parse (csgo_turkish.txt) ->69 milliseconds
Parse (csgo_ukrainian.txt) ->73 milliseconds
Parse (csgo_vietnamese.txt) ->66 milliseconds
All Time taken by function: 1811 milliseconds
dragokas commented 3 years ago

@dvander, thanks for useful info )

In my tests, SATA HDD, CSGO files (270 MB.) it took just 0.3 sec. for re-encoding buffer, and 8.3 sec. for files reading. I was wrong when I told hr.Read is slow (I forgot), that was SQLite database pending operations too slow on HDD, where I stored my phrases.

Any ideas about topic, why CSGO throw "Unknown read error 38" ?

Wend4r commented 3 years ago

Any ideas about topic, why CSGO throw "Unknown read error 38" ?

https://gist.github.com/vtellier/e04f78ce29cfb519e49e352fa745d7d1#file-lasterror-h-L60 ReadFile from WinAPI reached EOF

asherkin commented 3 years ago

This looks like it is probably a bug in the engine's implementation of overlapped IO on Windows, can you see if the problem still happens if you set the filesystem_use_overlapped_io hidden cvar to 0?

It looks like you read all the data and the message is just a DevMsg, so you can probably just ignore it. If it is causing problems other than the message it'd be good to expand upon that in the issue description.

dragokas commented 3 years ago

@asherkin sorry for late reply. Unfortunately,

sm_cvar filesystem_use_overlapped_io 0

didn't help to bypass the error message.

And, yeah, there are more issues with valve_fs when reading large file (maybe not related?): https://github.com/alliedmodders/sourcemod/issues/1577

asherkin commented 3 years ago

I don't think there is anything we can do here, our usage of the filesystem API all seems correct (and as you say it works on all other games). I'm going to close this as a Valve issue until something surfaces indicating that SM is using the API incorrectly.