X-Sharp / XSharpPublic

Public repository for the source code for the XSharp Compiler, Runtime, Project System and Tools.
Apache License 2.0

RDD slowness with indexes #1644

Open cpyrgas opened 6 days ago

cpyrgas commented 6 days ago

https://www.xsharp.eu/forum/topic?p=32015#p32015

When appending records with an index active, the time taken in X# grows exponentially with the number of records in the DBF. In VO it grows linearly, so for a large number of records the difference in speed becomes huge. The following code, which can be run in both VO and X#, demonstrates this. Note that without an index, the time taken in both VO and X# increases linearly.

FUNCTION Start() AS VOID STRICT
    LOCAL cDBF AS STRING
    LOCAL aFields AS ARRAY
    LOCAL i AS DWORD

    LOCAL nStart, nElapsed AS REAL8

    RddSetDefault ( "DBFCDX" )

    cDBF := "C:\test\mytest"
    FErase ( cDBF + ".cdx" )

    aFields := { { "LAST" , "C" , 20 , 0 } }

    DbCreate( cDBF , aFields)
    DbUseArea( ,,cDBF )

    DBCreateOrder("last", "C:\test\mytest.cdx", "last")   // comment this out and it will be faster

    // wait

    nStart := Seconds()

    ? "start : " + ntrim(nStart)

    FOR i := 1 UPTO 1000000
        DbAppend()
        fieldput(1, asstring(i))
        // ? i
    NEXT

    nElapsed := Seconds() - nStart

    ? "Elapsed: " + ntrim(nElapsed) + " seconds" + " - Minutes: " + ntrim(nElapsed/60.00)

    wait
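
To see how the cost grows, rather than only the total, the loop above can report the time taken per batch of appends. The following is a minimal sketch, assuming the same file paths and field layout as the sample above. The batch size `nBatch` and the per-batch output are illustrative additions, not part of the original report. With linear growth every batch should take about the same time; if later batches take progressively longer, growth is faster than linear.

```xsharp
FUNCTION Start() AS VOID STRICT
    LOCAL cDBF AS STRING
    LOCAL i, nBatch AS DWORD
    LOCAL nBatchStart AS REAL8

    RddSetDefault( "DBFCDX" )

    cDBF := "C:\test\mytest"
    FErase( cDBF + ".cdx" )

    DbCreate( cDBF, { { "LAST", "C", 20, 0 } } )
    DbUseArea( ,, cDBF )
    DbCreateOrder( "last", cDBF + ".cdx", "last" )

    nBatch := 100000              // hypothetical batch size, adjust as needed
    nBatchStart := Seconds()

    FOR i := 1 UPTO 1000000
        DbAppend()
        FieldPut( 1, AsString( i ) )
        IF i % nBatch == 0
            // Time taken by the last nBatch appends only, not the running total
            ? "Records " + NTrim( i ) + ": " + NTrim( Seconds() - nBatchStart ) + " seconds"
            nBatchStart := Seconds()
        ENDIF
    NEXT

    DbCloseArea()
    wait
```

Seconds() counts from midnight, so a run that crosses midnight will report a negative batch time.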

RobertvanderHulst commented 6 days ago

I compared the code in VO with X#. In VO almost no data is written for the DBF during the whole process. All data is written at the end of the loop. In X# each append results in:

RobertvanderHulst commented 6 days ago

The exponential growth is probably caused by later insertions introducing new levels in the index tree, so that a single key insert results in multiple pages being written. In VO the index pages are also kept in memory, so that is less of an issue.
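
One way to test this explanation is to take index maintenance out of the append loop: append all records with no order active, then build the index once at the end. The sketch below assumes the same paths and field layout as the original sample. It is only a diagnostic, or a possible workaround for bulk loads, not a fix for the RDD itself. If the append phase scales linearly and the single index build is fast, that would point at per-insert index page writes as the main cost.

```xsharp
FUNCTION Start() AS VOID STRICT
    LOCAL cDBF AS STRING
    LOCAL i AS DWORD
    LOCAL nStart AS REAL8

    RddSetDefault( "DBFCDX" )

    cDBF := "C:\test\mytest"
    FErase( cDBF + ".cdx" )

    DbCreate( cDBF, { { "LAST", "C", 20, 0 } } )
    DbUseArea( ,, cDBF )

    // Phase 1: append all records with no index active
    nStart := Seconds()
    FOR i := 1 UPTO 1000000
        DbAppend()
        FieldPut( 1, AsString( i ) )
    NEXT
    ? "Append: " + NTrim( Seconds() - nStart ) + " seconds"

    // Phase 2: build the index once, after all records exist
    nStart := Seconds()
    DbCreateOrder( "last", cDBF + ".cdx", "last" )
    ? "Index build: " + NTrim( Seconds() - nStart ) + " seconds"

    DbCloseArea()
    wait
```

Comparing the sum of the two phases with the original indexed loop, in both VO and X#, shows how much of the difference comes from updating the index on every append.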

cpyrgas commented 5 days ago

Can we do the same thing in X#, keeping the index pages in memory?

But in any case, I am not sure how much of an impact such an improvement would have in a real-life scenario, as adding 100,000 or a million records at once is not an everyday thing.

Wolfgang told me he sees significant slowness in other cases; I have asked him for a sample.