lino-framework / eidreader

eidreader Python script
http://eidreader.lino-framework.org/
GNU Affero General Public License v3.0

Speed performance issue #12

Open FlorentGhilain opened 8 months ago

FlorentGhilain commented 8 months ago

Hi,

First, thanks for the library.

In our business case we only need to read part of the BEID data (i.e. national_number, surname and firstnames). I noticed that we always read nearly everything via objs = sess.findObjects([(CKA_CLASS, CKO_DATA)]), which is why the process can take some time.

Is there a way to speed up the read by selecting only the fields we are interested in? (Should we dig into the PKCS11 API side? Any hints?)

Thanks

lsaffre commented 8 months ago

Pleased to read that it's useful to others as well. I am quite convinced that the photo is the only field that actually takes a noticeable time. So I imagine a command-line option --nophoto would be the solution to your problem.

lsaffre commented 8 months ago

Yes, as you said, we need to dig into the PKCS11 API and then review our two calls to sess.findObjects() to make them more granular. Ideally we should add test cases before doing this change, but that would be another issue.

FlorentGhilain commented 8 months ago

@lsaffre Thanks for the quick advice!

I found a way to do it by digging through the samples that ship with the Python library PyKCS11.

We can simply add more attribute filters to the template passed to the findObjects method.

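# One label filter per field we actually need.
fields_filter = [
    (CKA_LABEL, "national_number"),
    (CKA_LABEL, "surname"),
    (CKA_LABEL, "firstnames"),
]
...
# Query the card once per label instead of fetching every data object.
objs = []
for field_filter in fields_filter:
    objs += sess.findObjects([(CKA_CLASS, CKO_DATA), field_filter])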

It now takes ~100 ms, compared to the previous 2 to 3 seconds.
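For anyone who wants to try the per-label lookup without a card reader, here is a minimal sketch of the same loop wrapped in a function. The names find_labelled_objects and FakeSession are hypothetical and not part of eidreader or PyKCS11; FakeSession only imitates how findObjects matches a template, and the label "other_field" is made up for the demo.

```python
# Sketch only: FakeSession is a hypothetical stand-in for a PyKCS11 session,
# so the example runs without a card reader or the PyKCS11 package.
try:
    from PyKCS11 import CKA_CLASS, CKA_LABEL, CKO_DATA
except ImportError:
    # Numeric values from the PKCS#11 specification, used as a fallback.
    CKA_CLASS, CKA_LABEL, CKO_DATA = 0x00, 0x03, 0x00


def find_labelled_objects(sess, labels):
    """Return the data objects whose CKA_LABEL is one of `labels`.

    One findObjects() call is issued per label, because a PKCS#11
    template only matches objects that satisfy all of its attributes.
    """
    objs = []
    for label in labels:
        objs += sess.findObjects([(CKA_CLASS, CKO_DATA), (CKA_LABEL, label)])
    return objs


class FakeSession:
    """Hypothetical stub that mimics findObjects() template matching."""

    def __init__(self, objects):
        self.objects = objects  # list of dicts mapping attribute -> value
        self.calls = 0          # number of findObjects() calls made

    def findObjects(self, template):
        self.calls += 1
        return [
            obj for obj in self.objects
            if all(obj.get(attr) == value for attr, value in template)
        ]


# A made-up card with four data objects; "other_field" is illustrative only.
card = [
    {CKA_CLASS: CKO_DATA, CKA_LABEL: label}
    for label in ("national_number", "surname", "firstnames", "other_field")
]
```

With a real session the function would be called the same way, e.g. find_labelled_objects(sess, ["national_number", "surname", "firstnames"]).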

At the moment I have hardcoded the fields we really need, but we could easily pass them as a command-line parameter and retrieve only those labels. If I find some time, I will open a PR for this feature.
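A rough sketch of what the command-line side could look like, assuming a comma-separated option. The option name --fields and the helper label_filters are hypothetical; eidreader does not currently define them, and the real PR may choose a different interface.

```python
import argparse

# PKCS#11 attribute type for labels; PyKCS11 exports the same constant.
CKA_LABEL = 0x03


def build_parser():
    parser = argparse.ArgumentParser(description="hypothetical field selection")
    parser.add_argument(
        "--fields",  # hypothetical option name, not an existing eidreader flag
        default="",
        help="comma-separated labels to read; empty means read everything",
    )
    return parser


def label_filters(fields_arg):
    """Turn 'a,b,c' into [(CKA_LABEL, 'a'), (CKA_LABEL, 'b'), (CKA_LABEL, 'c')]."""
    labels = [name.strip() for name in fields_arg.split(",") if name.strip()]
    return [(CKA_LABEL, label) for label in labels]


# Parse an explicit argument list here so the sketch runs anywhere.
args = build_parser().parse_args(["--fields", "national_number,surname,firstnames"])
fields_filter = label_filters(args.fields)
```

An empty fields_filter would then mean "fall back to the existing full scan", which keeps the current behaviour as the default.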

lsaffre commented 8 months ago

Yes, that looks great. Yes, a PR would be great.