Closed ws-garcia closed 3 years ago
I am able to parse that file using my FromCSVFile
method. See the below screenshot:
If you upload the file you were testing in for me to take a look at I will do so but at the moment its a "works for me" situation, especially if I can't repro and it sounds like you have edited my class in some way(?).
If I cant reproduce I'll have to close the issue but as I say happy to take a look at any file you have.
I assume your issue is that the peak memory usage exceeds 4GB. I assume you are running a 32Bit application and therefore run out of available RAM?
it sounds like you have edited my class in some way(?)
Your class module is untouched.
Here is the code I'm using
Sub SENIPAHtest(ByVal filePath As String)
Dim Senipah As BetterArray
Dim csv As CSVinterface
Dim CSVdata As String
Dim avg As Double
Dim e As Double, f As Double
'Test SENIPAH method
Set csv = New CSVinterface
CSVdata = csv.GetDataFromCSV(filePath)
Set csv = Nothing
Set Senipah = New BetterArray
avg = 0
e = Timer
Call Senipah.FromCSVString(CSVdata)
f = Timer
avg = avg + CDbl(f - e)
Set Senipah = Nothing
Debug.Print "[Senipah FromCSVString:"; Round(avg, 4); "seconds]"
End Sub
And here using the FromCSVFile
method
Sub SENIPAHtest(ByVal filePath As String)
Dim Senipah As BetterArray
Dim avg As Double
Dim e As Double, f As Double
'Test SENIPAH method
Set Senipah = New BetterArray
avg = 0
e = Timer
Call Senipah.FromCSVFile(filePath)
f = Timer
avg = avg + CDbl(f - e)
Set Senipah = Nothing
Debug.Print "[Senipah FromCSVString:"; Round(avg, 4); "seconds]"
End Sub
Now I'm getting a "Out of memory error" on a machine running Win 10 x64, with 8GB RAM, Excel 2019 32bits. I'm still surprised!
I assume your issue is that the peak memory usage exceeds 4GB. I assume you are running a 32Bit application and therefore run out of available RAM?
This can be the problem here.
The question here is only one: it’s possible to deal with the memory consumption, when working with String data type, for your solution?
I think there is nothing to do with my issue!
Alright, so it sounds like the issue is that because mine uses more auxiliary memory while parsing that 1.6m row file than is available to 32Bit Excel.
Out of interest I converted mine to return a string array rather than store a variant array in the class and that did bring it's peak usage below 4GB. Downloadable here: StringCSVTest.zip
While my solution uses more auxiliary memory than yours and the other solution you benchmark against I am a bit unwilling to refactor this method any more at this time as I think that the chance of anyone in the real world seeking to parse a file that large using my class is both unlikely and ill-advised. As you know, CSV parsing isn't my library's main focus: it exists as a utility, and it does that job acceptably at the moment I think.
That said, If you have any suggestions or want to submit a PR to try and improve the performance of my parser (while maintaining the functionality of the class as a whole) I am definitely willing to review.
and it does that job acceptably at the moment I think.
Your parser do the parsing task in a outstanding way! The variations in the CSV syntax don't affect its performance and, for me, this is stunning!
Is your feature request related to a problem? YES
Please describe. Dear @Senipah, it’s me, AGAIN! I was testing the CSV Interface class trying to find the memory limits of it and, in the same path of work, check if the change from
String
array toVariant
jagged array has the potential to make the class run out of memory for a CSV file of couple of megabytes of size. I used both your solution and sdkn's as benchmarks. The test’s objective was invoking the import procedure, bypassing the result in the case of @sdkn104, and avoiding copies of arrays in the case of your project and mine, so that the imported data was only present in the class members. As a result of the above, I get a surprising error when invoke yoursFromCSVString
method. It’s about the error 14: “Out of string space”.Describe the solution you'd like The question here is only one: it’s possible to deal with the memory consumption, when working with
String
data type, for your solution?Additional context Attached to this I let you the file used as “guinea pig” (Demo_1.6MM_records.zip), hoping you can find the error source, and, additional to that, some screen shoots with the memory usage for both, @sdkn104 and mine, solutions.
NOTE: due to file size, the Demo_1.6MM_records.zip contains a .7z file inside.