cristianbuse / VBA-FastDictionary

Fast Native Dictionary for VBA compatible with Windows and Mac
MIT License
27 stars 4 forks source link

[Documentation] Some comments on docs #2

Closed sancarn closed 6 months ago

sancarn commented 7 months ago

First off, very glad you made this. Your ideas mirror a lot of what I was planning for stdDictionary but executed in a manner significantly better than what I could have patience for! Also awesome it's self contained with no dependencies! Bravo! šŸŽ‰

Moreover, most likely no one would ever rely on this kind of functionality considering the Exists method

Definitely haven't relied on this functionality before šŸ‘€šŸ‘€šŸ‘€ But you're right, the reality is people should use exists. I have used this in the past when my dictionary is purely truthy:

if dict(someKey) then ...

Given Empty is coerced to false. Perhaps an option could be added to the dictionary for this uncommon case šŸ˜›

To avoid this unfortunate and misleading behaviour, this Dictionary distinguishes between these values and allows all 3 at the same time.

Not sure this is clear. Are you insinuating that Empty, "" and 0 can be used simultaneously for 3 separate values? I.E.

dict("") = 1
dict(Empty) = 2
dict(0) = 3
debug.print dict("") '1
debug.print dict(Empty) '2
debug.print dict(0) '3

Scripting.Dictionary.HashVal ...

A super interesting read, but also a concern in the scenario that scripting dictionary ever does get fully removed from Windows OS... Unless I'm mistaken? I suspect the only alternative/mitigation would be a thunk correct?

cristianbuse commented 6 months ago

Thanks @sancarn for the feedback and the kind words!

1

if dict(someKey) then ...

I initially thought of having a STRICT_COMPATIBILITY compiler constant and have everything match the way Scripting.Dictionary works. But I quickly discarded this because I do not want to perpetuate this behaviour. Same for the casting to Single incompatibility.

First problem with something like if dict(someKey) then ... is speed. Running this:

Sub TestExistsSpeed()
    Dim c As Currency
    Dim d As New scripting.Dictionary
    Dim i As Long
    Dim res As Boolean
    Const initialSize As Long = 100
    Const sizeToCheck As Long = 1000000
    '
    For i = 1 To initialSize
        d.Add i, i
    Next i
    '
    c = AccurateTimerMs
    For i = 1 To sizeToCheck
        res = d.Exists(i)
    Next i
    Debug.Print "Exists: " & Round(AccurateTimerMs - c, 0)
    '
    c = AccurateTimerMs
    For i = 1 To sizeToCheck
        res = CBool(d(i))
    Next i
    Debug.Print "Item (Get): " & Round(AccurateTimerMs - c, 0)
End Sub

I get these results (milliseconds):

Exists: 69
Item (Get): 15935

Increasing the number of checks by a factor of 10 results in 716 ms for Exists and a freeze of my computer for the Item approach - I crashed the app after 15 minutes.

Second problem with something like if dict(someKey) then ... is that it raises an error if the item returned is an Object.

Do you still think I should add the functionality for compatibility only?

2

Are you insinuating that Empty, "" and 0 can be used simultaneously for 3 separate values?

Yes, exactly. Would it be clearer if I add your code snippet?

3

A super interesting read, but also a concern in the scenario that scripting dictionary ever does get fully removed from Windows OS... Unless I'm mistaken? I suspect the only alternative/mitigation would be a thunk correct?

No need for a thunk. On Mac, this Dictionary already iterates char codes via a fake array of integers. The only reason I used the scripting dictionary's HashVal on Win is because it works better for lengthier text.

I can make a check and see if the CreateObject returns Nothing and then add an If statement to use the iteration hash if scripting dict not available. I actually had this check at some point but then removed it because I do not want to do this at runtime - because of the speed penalty of the extra If statement.

Will think about this again but if you have any ideas then please let me know.

cristianbuse commented 6 months ago

Hi @sancarn ,

I used your code snippet here. This hopefully makes it more clear.

This makes sure hashing still works if Scripting.Dictionary is not available. The extra If does have a speed penalty but I guess it's much safer to have a backup if people really start using this Dictionary.

What remains is, do we really want to implement an option for strict compatibility? I would love to hear your thoughts on this, especially after the examples I provided in my previous reply.

Many thanks!

sancarn commented 6 months ago
    c = AccurateTimerMs
    For i = 1 To sizeToCheck
        res = d.Exists(i)
    Next i
    Debug.Print "Exists: " & Round(AccurateTimerMs - c, 0)
    '
    c = AccurateTimerMs
    For i = 1 To sizeToCheck
        res = CBool(d(i))
    Next i

To be fair this is a bit like comparing apples with oranges. if dict(key) then isn't equivelnt to if dict.exists(key) then but rather if dict.exists(key) then if dict(key) then ... - does depend on the circumstances of course...

I think an option for compatibility is probably a good idea, not to give you more work šŸ˜›

Would it be clearer if I add your code snippet?

Yeah I think with the snippet it would be clear enough.

I can make a check and see if the CreateObject returns Nothing and then

Makes total sense.

but then removed it because I do not want to do this at runtime

Yeah I can totally sympathise, that would be my concern too. Not sure I really know a way around it either (apart from compiler constants, which have other downsides). I wonder if you could instead of using HashVal use a pointer to a function in the class or something... Would probably be significantly slower though :/

What remains is, do we really want to implement an option for strict compatibility? I would love to hear your thoughts on this, especially after the examples I provided in my previous reply.

Yeah another good question, realistically I think the current incompatibilities for the most part are good tradeoffs, and wouild usually be considered bugs anyway. For instance dict(Empty) should be different to dict("") in my opinion, so glad that is the case in this dictionary. This may break some stuff, but it's not an extremely difficult thing to work around either.

cristianbuse commented 6 months ago

Thanks @sancarn

Yeah I think with the snippet it would be clear enough.

Already added here

Yeah I can totally sympathise, that would be my concern too. Not sure I really know a way around it either (apart from compiler constants, which have other downsides). I wonder if you could instead of using HashVal use a pointer to a function in the class or something... Would probably be significantly slower though :/

Already added here. The speed loss is not that much (ran a few speed tests) and I certainly prefer having the extra failsafe. I discarded compiler constants because I prefer it works without intervention.

Yeah another good question, realistically I think the current incompatibilities for the most part are good tradeoffs, and wouild usually be considered bugs anyway

Yeah, I think they are good tradeoffs and would rather not implement a strict option, at least for now.

Many thanks for your feedback!