werelate / wiki

wiki code for WeRelate.org
GNU General Public License v2.0

DataQuality enhancements #80

Closed JanetBjorndahl closed 2 years ago

JanetBjorndahl commented 2 years ago

Enhanced functionality and improved performance for the Special:DataQuality page

JanetBjorndahl commented 2 years ago

Thanks, Dallan

Looks good, although filtering on my watchlist still times out. I have one other idea to pursue, but not right away. I’ll update the Watercooler tomorrow.

Janet

From: Dallan Quass; Sent: May 15, 2022 7:53 PM; To: werelate/wiki; Cc: JanetBjorndahl, Author; Subject: Re: [werelate/wiki] DataQuality enhancements (PR #80)

Merged #80 https://github.com/werelate/wiki/pull/80 into master.


DallanQ commented 2 years ago

FWIW, I ran it tonight and it ran without errors. We could add more indexes if needed. If it would help, I can give you the password to the production machines.

Dallan


JanetBjorndahl commented 2 years ago

Hi, Dallan.

Thanks

I don’t think additional indexes will help. The query already uses the primary index on watchlist, which is exactly what that index is for. I am confused by the execution plan, though: it shows the index being used for only 2 columns (wl_user and wl_namespace), and then notes a further condition to match, dq_title = wl_title, even though wl_title is also in the index. I have to assume the optimizer sees some advantage in that, probably because of the way the index is stored.
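A small sketch of one possible reading of that plan, not confirmed anywhere in this thread: if the optimizer reads watchlist first, only the constant columns can be used for the index lookup, because dq_title is not known until the other table is reached. The example uses SQLite from Python as a stand-in (production is a different database with a different optimizer), and the `dq_demo` table and `dq_issue` column are hypothetical names made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE watchlist (
        wl_user      INTEGER NOT NULL,
        wl_namespace INTEGER NOT NULL,
        wl_title     TEXT NOT NULL,
        PRIMARY KEY (wl_user, wl_namespace, wl_title)
    );
    -- Hypothetical stand-in for the data quality table.
    CREATE TABLE dq_demo (
        dq_title TEXT PRIMARY KEY,
        dq_issue TEXT
    );
""")

WHERE = "WHERE wl_user = 1 AND wl_namespace = 0 AND dq_title = wl_title"


def watchlist_step(from_clause):
    """Return the EXPLAIN QUERY PLAN detail line describing the watchlist access."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT dq_issue FROM {from_clause} {WHERE}"
    ).fetchall()
    # The fourth column of each plan row is the human-readable detail text.
    return next(row[3] for row in rows if "watchlist" in row[3])


# In SQLite, CROSS JOIN pins the left-hand table as the outer loop,
# which lets the two join orders be compared directly.
watchlist_outer = watchlist_step("watchlist CROSS JOIN dq_demo")
dq_outer = watchlist_step("dq_demo CROSS JOIN watchlist")

print("watchlist read first:", watchlist_outer)
print("dq_demo read first:  ", dq_outer)
```

With watchlist read first, the plan line lists only wl_user and wl_namespace as index terms; with the other table read first, wl_title joins them. The exact wording of the plan text varies by SQLite version.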

The password to production would help me confirm my performance testing, which I can only approximate on my machine. For example, I don’t know how many records to load into watchlist (and I know better than to run “select count(*) from watchlist”, which, if I remember correctly, severely taxed my machine once I had loaded a lot of records). I am a bit leery of accidentally overtaxing the production database. It would also help to have a user id and password with only SELECT privileges, so that I couldn’t confuse sandbox and production and accidentally change data in production.

Question: are you blocking the SpecialDataQuality page during the batch run of the Java job? It isn’t working right now. It was designed to remain usable while the Java job is running (using data from the previous job). Let me know.

I have to set up for a meeting – bye for now

Janet

