Open GoogleCodeExporter opened 9 years ago
My personal preference would be a long value type for the internal ids. The
ideal id would require the absolute minimum number of cycles to create and read.
Original comment by dubdub1...@gmail.com
on 8 Mar 2009 at 3:47
Agreed. One thing an id should do is perform well.
Of course the id has to be unique as well, so the actions needed to ensure
uniqueness have to be included when doing performance tests.
When using int/long you need to ensure that an id cannot be used twice, even if
the application crashes. Otherwise the system cannot work as designed anymore.
So what you need to do is actually save the last id somewhere. Since you intend
not to use a db for the core, this would mean writing to the filesystem in some
format. Remember this must be done at every id creation, to persistently store
the last id so it can be retrieved in the event of an application/system crash.
Now what if the crash is caused by writing the id to the filesystem, or there is
some error with the filesystem in general and the last-id file is corrupted?
How would one determine the next id if the last used id is not known?
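To make the cost of this concrete, here is a minimal sketch of such a generator in Python (chosen only for illustration; the project's own language would differ). The class name `PersistedCounter` and the plain-text file format are assumptions, not anything from the project. It writes to a temporary file and renames it over the old one, which is one common way to avoid a half-written last-id file, but it does not answer what to do if the file is corrupted anyway.

```python
import os
import tempfile


class PersistedCounter:
    """Hypothetical long-id generator that persists the last id on every call."""

    def __init__(self, path):
        self.path = path
        self.last_id = self._load()

    def _load(self):
        try:
            with open(self.path, "r", encoding="ascii") as f:
                # A ValueError here is the "corrupted last-id file" case;
                # this sketch simply lets it propagate.
                return int(f.read().strip())
        except FileNotFoundError:
            return 0  # first run: no id has been issued yet

    def next_id(self):
        new_id = self.last_id + 1
        # Write to a temp file in the same directory, then rename it over the
        # old file, so a crash mid-write leaves the previous file intact.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="ascii") as f:
            f.write(str(new_id))
            f.flush()
            os.fsync(f.fileno())  # this disk sync is the per-id cost
        os.replace(tmp, self.path)
        self.last_id = new_id
        return new_id
```

Every call to `next_id()` pays for a file write and a sync to disk, which is the overhead that has to be counted in any performance test.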
To get rid of that issue a mini db could be used to store the last id, since dbs
handle filesystem issues themselves (most of them do anyway ;)). But the
performance penalty still applies.
With Guid you get uniqueness by design. But performance-wise the pure
generation of the Guid is not nearly as good as simply incrementing an
int/long. The difference is most likely to be substantial!
But including the logic (filesystem writes/db calls) that needs to be in place
to ensure uniqueness for int/long, the overall performance of a Guid creation
should be better.
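One way to check this claim rather than argue it is a small timing harness. The sketch below is an assumption-laden stand-in: it uses Python's `uuid.uuid4()` in place of a .NET Guid, and a plain file write with `os.fsync` in place of whatever persistence the core would actually use. The function name `time_strategies` is hypothetical, and the numbers will vary by machine and filesystem.

```python
import itertools
import os
import tempfile
import timeit
import uuid


def time_strategies(n=200):
    """Return the seconds spent creating n ids with each strategy."""
    counter = itertools.count(1)
    fd, path = tempfile.mkstemp()
    os.close(fd)

    def persisted_increment():
        new_id = next(counter)
        with open(path, "w", encoding="ascii") as f:
            f.write(str(new_id))
            f.flush()
            os.fsync(f.fileno())  # force the write to disk for crash safety
        return new_id

    try:
        return {
            "increment only": timeit.timeit(lambda: next(counter), number=n),
            "guid (uuid4)": timeit.timeit(uuid.uuid4, number=n),
            "increment + fsync": timeit.timeit(persisted_increment, number=n),
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    for name, seconds in time_strategies().items():
        print(f"{name:>18}: {seconds:.6f} s")
```

The comparison that matters for this discussion is the second row against the third, not against the first.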
The difference in reading a long vs. a Guid should be low (8 vs. 16 bytes) and
can be disregarded, since the biggest issues (by far) when talking about
reading are string/text values.
The same applies to the amount of storage that is necessary to store the id.
The id is the smallest part of the "data" that has to be stored (in memory or
otherwise).
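As a quick illustration of the sizes involved, the following sketch packs a 64-bit integer and a 128-bit identifier into raw bytes and compares them with a short text value. The sample string is made up for the example, and Python's `uuid` module again stands in for Guid.

```python
import struct
import uuid

long_id = struct.pack("<q", 123456789)  # signed 64-bit integer
guid_id = uuid.uuid4().bytes            # 128-bit identifier
text_value = "a typical short description field".encode("utf-8")

print("long:", len(long_id), "bytes")
print("guid:", len(guid_id), "bytes")
print("text:", len(text_value), "bytes")
```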
Original comment by ntzioli...@googlemail.com
on 8 Mar 2009 at 6:22
Given the performance impact of GUID on the entire system, even with extended
implementations, its use is questionable. It would make more sense to have a
queue called "id" and a single field that stores the last id. At start-up you
generate 1000 IDs and write the number 1000 to the last id value in the db. If
the queue drops to 200, you add 800 new IDs to the queue and update the last id
value. This would ensure that your ID values stay unique. It would be best to
verify this with a test. I would say we test GUID vs long in a database table
with 100'000'000 rows too.
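The queue idea above can be sketched roughly as follows. The class name `BlockIdQueue` is hypothetical, and a plain dict stands in for the single "last id" field in the db; a real implementation would have to make that write durable before handing out any id from the new block.

```python
from collections import deque


class BlockIdQueue:
    """Hypothetical id queue refilled in blocks; only the last id is stored."""

    def __init__(self, store, block_size=1000, low_water=200):
        self.store = store        # assumed dict-like stand-in for the db field
        self.block_size = block_size
        self.low_water = low_water
        self.queue = deque()
        self._refill(block_size)  # at start-up: reserve the first block

    def _refill(self, count):
        last = self.store.get("last_id", 0)
        # Record the new last id BEFORE any id from this block is handed out.
        self.store["last_id"] = last + count
        self.queue.extend(range(last + 1, last + count + 1))

    def next_id(self):
        new_id = self.queue.popleft()
        if len(self.queue) <= self.low_water:
            # Top the queue back up to block_size (800 new ids at 200 left).
            self._refill(self.block_size - len(self.queue))
        return new_id
```

After a crash or restart the queue starts from the stored last id, so any ids still sitting in the old queue are skipped. The sequence gets gaps, but no id is issued twice, and the store is written once per 800 ids instead of once per id.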
I'm not completely against using GUID as an identifier, but speed is more
important.
The real question is: why do internal IDs need to be unique across different
sessions? Any database or XML writes will be performed by a higher level, where
the impact of ID generation is much less of an issue. The internal reference
ID's sole purpose is to be the most efficient, fastest internal lookup
identifier possible, and starting from 0 will use the smallest memory blocks
possible.
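A minimal sketch of what session-local ids starting from 0 could look like, assuming the ids never leave the process: the id is simply the item's position in a list, so creation is an append and lookup is an index. The name `SessionRegistry` is made up for this example.

```python
class SessionRegistry:
    """Hypothetical in-memory registry whose ids are valid for one session only."""

    def __init__(self):
        self._items = []  # the id of an item is its index in this list

    def add(self, item):
        self._items.append(item)
        return len(self._items) - 1  # ids run 0, 1, 2, ... with no gaps

    def get(self, item_id):
        return self._items[item_id]
```

Nothing is persisted here, so ids restart at 0 in every session. Any mapping to durable identifiers would be the job of the higher level that does the database or XML writes.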
Original comment by dubdub1...@gmail.com
on 8 Mar 2009 at 11:28
Original issue reported on code.google.com by
ntzioli...@googlemail.com
on 8 Mar 2009 at 4:51