publiclab / plots2

a collaborative knowledge-exchange platform in Rails; we welcome first-time contributors! :balloon:
https://publiclab.org
GNU General Public License v3.0
961 stars 1.83k forks source link

Wrong wiki count in Dropdown by type #7334

Closed keshavsethi closed 4 years ago

keshavsethi commented 4 years ago

In https://publiclab.org/contributors/newsletter dropdown by type Wiki pages is showing 3 but on clicking it shows only 1 Image is shown below Screenshot from 2020-01-23 17-47-30

keshavsethi commented 4 years ago

If we click on by Views button then it shows all 3 but again if try to sort all 3 of them by title it again shows 1 newsletter

gabrielbaldao commented 4 years ago

Can I try fix this?

keshavsethi commented 4 years ago

yes, you can try this but before doing this get it reviewed by @SidharthBansal @nstjean @jywarren

SidharthBansal commented 4 years ago

Sure. Please provide a fix

NitinBhasneria commented 4 years ago

@gabrielbaldao are you doing this or can I proceed?

SidharthBansal commented 4 years ago

Go ahead

jywarren commented 4 years ago

Note we should make this match the code and tests here: https://github.com/publiclab/plots2/pull/7476

and the idea for a "single source of truth" here: https://github.com/publiclab/plots2/issues/6855#issuecomment-578331475

Thank you!

jywarren commented 4 years ago

And noting a more detailed investigation here -- I think we need to be sure all counts are from a unified body of code maintained in the models, and unit tested!

https://github.com/publiclab/plots2/issues/7965#issuecomment-644980326

jywarren commented 4 years ago

So we should confirm that all counts are coming from the respective Tag.count style methods. Key areas are here:

https://github.com/ckchu8/plots2/blob/0fbcc07bd61fd238d88779e49a3fbd2e300ef041/app/controllers/tag_controller.rb#L542-L561

And on tag cards, for example on https://publiclab.org/tags

jywarren commented 4 years ago

Just noting that the original issue at top may be a special case. I see in the DB that we're fetching these three pages:

Only the middle one is the correct one. The first may be some kind of strange clone... not sure. From the early days of the site? And the last, doesn't look like a page anymore?

Here are the full records:

irb(main):005:0> Node.for_tagname_and_type('newsletter', 'page', wildcard: false).last => #<Node nid: 4965, vid: 18544, type: "page", language: "", title: "Newsletter", uid: 1, status: 1, created: 1353450804, changed: 1484959036, comment: 0, promote: 0, moderate: 0, sticky: 0, tnid: 0, translate: 0, cached_likes: 1, comments_count: 0, drupal_node_revisions_count: 5, path: "/wiki/newsletter", main_image_id: nil, slug: "newsletter", views: 1471, latitude: nil, longitude: nil, precision: nil>

irb(main):006:0> Node.for_tagname_and_type('newsletter', 'page', wildcard: false).first => #<Node nid: 308, vid: 20121, type: "page", language: "", title: "Subscribe", uid: 1, status: 1, created: 1306885186, changed: 1484959009, comment: 0, promote: 0, moderate: 0, sticky: 0, tnid: 0, translate: 0, cached_likes: 0, comments_count: 0, drupal_node_revisions_count: 13, path: "/subscribe", main_image_id: nil, slug: "subscribe", views: 25, latitude: nil, longitude: nil, precision: nil>

irb(main):007:0> Node.for_tagname_and_type('newsletter', 'page', wildcard: false)[1] => #<Node nid: 4965, vid: 18544, type: "page", language: "", title: "Newsletter", uid: 1, status: 1, created: 1353450804, changed: 1484959036, comment: 0, promote: 0, moderate: 0, sticky: 0, tnid: 0, translate: 0, cached_likes: 1, comments_count: 0, drupal_node_revisions_count: 5, path: "/wiki/newsletter", main_image_id: nil, slug: "newsletter", views: 1471, latitude: nil, longitude: nil, precision: nil>

Very mysterious!

jywarren commented 4 years ago

Jeanette from PL noted this page also has an elevated number of wiki pages listed on the tab count:

image

83 is a much bigger error than what @keshavsethi had found, as only 6 wiki pages are shown. We can go into the database directly to just skim what those wiki pages actually are, in the query returning 86, and compare the queries too.

jywarren commented 4 years ago

OK! The disparity, for the method tag, is these different queries:

irb(main):002:0> Node.for_tagname_and_type('method', 'page', wildcard: false).count
=> 83
irb(main):001:0> Tag.find_nodes_by_type('method','page').count
=> 5

I'll look at the code behind them to confirm what's going on.

Here is the tab count: https://github.com/publiclab/plots2/blob/92834ead6c53d6d9431bb99024d4044b282518be/app/controllers/tag_controller.rb#L542-L561

It references Node.for_tagname_and_type:

https://github.com/publiclab/plots2/blob/122e8a73b440bb752f080375aad3f24a02b7c8f9/app/models/node.rb#L621-L630

By contrast, Tag.find_nodes_by_type is here:

https://github.com/publiclab/plots2/blob/122e8a73b440bb752f080375aad3f24a02b7c8f9/app/models/tag.rb#L94-L116

Noting that the first 6 items on both https://publiclab.org/wiki/tag/method and https://publiclab.org/methods are the same, but /wiki/tag/method cuts off after 6. A clue? 🕵️

jywarren commented 4 years ago

Both segments of code are significantly complicated by parent tag references.

jywarren commented 4 years ago

OK, so the difference is in the limit term:

irb(main):010:0> Tag.find_nodes_by_type('method','page', false).count
=> 83
irb(main):011:0> Tag.find_nodes_by_type('method','page').count
=> 5

Let's track where the limit is being set on https://publiclab.org/wiki/tag/method

But, a follow-up could be to consolidate this code and/or remove the parent functionality.

jywarren commented 4 years ago

So, the query for the listing of wiki pages in the table below on /wiki/tag/method is: https://github.com/publiclab/plots2/blob/92834ead6c53d6d9431bb99024d4044b282518be/app/controllers/tag_controller.rb#L117

I /believe/ none of these have parent tag issues, having checked in the db. So we're left with a strange mystery of pagination, which I can't quite figure out.

Each of these pages show a different collection, with some overlapping items:

We can probably get all 83 pages this way. So it could be pagination being missing -- we could activate by adding @pagination = true in the controller, to activate this line:

https://github.com/publiclab/plots2/blob/92834ead6c53d6d9431bb99024d4044b282518be/app/views/wiki/_wikis.html.erb#L30

But there's something else -- why is Odor Logging and Stormwater Monitoring the top entry for both the first couple pages? And why does Facilitate your meetings show up again on pages 3 and 4? Something is going on with our query.

jywarren commented 4 years ago

Let's fix pagination, and then return here to see what we can do about the repeated display of some items, as well as figure out why page 1 shows only 6, but later pages show a full 24 (i think?)???

jywarren commented 4 years ago

VERY strange -- https://github.com/publiclab/plots2/pull/8243#issuecomment-668782325 --- pagination mysteries persist...

jywarren commented 4 years ago

OK, noting from this comment that the /tag/____ page is cached every 5 mins.

jywarren commented 4 years ago

Indeed:

> Node.for_tagname_and_type('method','page',question:false).paginate(page: 1,per_page: 24).order('node_revisions.timestamp DESC').length
=> 4
jywarren commented 4 years ago

It seems to be the mix of .order('node_revisions.timestamp DESC') and .paginate() that is causing this. If I remove one, it starts to return 24 items:

irb(main):089:0> Node.for_tagname_and_type('method','page',question:false).paginate(page: 1,per_page: 24).order('node_revisions.timestamp DESC').length
=> 4
irb(main):090:0> Node.for_tagname_and_type('method','page',question:false).paginate(page: 1,per_page: 24).length
=> 24
irb(main):091:0> Node.for_tagname_and_type('method','page',question:false).order('node_revisions.timestamp DESC').paginate(page: 1,per_page: 24).length
=> 4
jywarren commented 4 years ago

https://github.com/publiclab/plots2/blob/f38bd542c380185f504a82b2806a49995ef0f736/app/models/node.rb#L626-L629

jywarren commented 4 years ago

I believe this may be an issue with our pagination library, will_paginate --

jywarren commented 4 years ago

Maybe we look at

jywarren commented 4 years ago

OK, we've installed pagy gem for pagination in #8326 - let's see how this does in staging now.

jywarren commented 4 years ago

Confirming that https://unstable.publiclab.org/wiki/tag/method now shows https://unstable.publiclab.org/wiki/tag/method?page=2, https://unstable.publiclab.org/wiki/tag/method?page=3, https://unstable.publiclab.org/wiki/tag/method?page=4 with all entries.

That resolves this original issue

(we should count the results to be sure)

however, I now notice that, strangely, the first page lists only 4, while later pages show the full max per-page limit:

image

jywarren commented 4 years ago

OK, we are doing better... 32 are listed. But the total should be 83!

13 are listed per page, so if the first page were missing, say, 9 to make a full 13, we'd still only be at 41 total. Let me try to get a list of the full 83 to check discrepancies...

jywarren commented 4 years ago

Here are all 83:

["bucket-monitor", "issue-brief", "facilitation", "air-sampling", "stormwater", "odor", "noise", "infragram", "raspberry-pi-infragram", "image-sequencer", "mapknitter", "purpleair", "thermal-flashlight", "pole-mapping", "optical-monitoring-of-particulate-matter", "simple-water-sensor-platform", "anonymity", "water-quality-sensor", "micro", "site-survey", "sampling", "pipeline-monitoring", "simple-air-sensor", "public-comment", "babylegs", "testing-our-waters", "passenger-pigeon", "coqui", "minivol", "emery-board", "indoor-air-quality-mapping", "turbidity", "start-enviro-monitor-study", "spectral-workbench", "soil-testing-toolkit", "filters", "nano-data-logger", "diy-indoor-air-quality-remediation-kit", "pole-mapping-kit-instructions", "balloon-mapping", "guides", "spectrometry", "conductivity", "particle-sensing", "microscopes", "webjack", "hydrogen-sulfide-copper-pipe", "data-logging", "filter-based-monitoring-of-particulate-matter", "kite-balloon-hybrid", "host-an-event", "water-sensors", "hydrogen-sulfide-photopaper", "hydrogen-sulfide-sensor", "dustduino", "timelapse", "sensor-enclosures", "riffle", "reagents", "thermal-photography", "photo-sharing", "water-sampling", "soil-sampling", "photo-monitoring", "biobroth-bubbler", "creating-a-media-campaign", "fido-the-raspberry-pi-based-temperature-alarm-that-sends-text-messages", "pdr-1500", "potentiostat", "kaptery-aerial-rigs", "aerial-photography", "mae-d-agua", "multispectral-imaging", "literature-review", "health-effects-lit-review", "tech-review", "refinery-watching", "gardening-toolkit", "air-column-monitor", "portable-energy-scavenging-kit", "environmental-estrogen-testing", "balloon-telemetry-kit", "stereo-camera"]

jywarren commented 4 years ago

OK - theory - just confirmed that the tag pin:test can remove a wiki page from being displayed on /wiki/tag/test (in gitpod). How many are being hidden this way???

jywarren commented 4 years ago

None from the 83. I'm going to try starting a development environment from staging and to try logging out the contents of the pagy() pagination collection. Very mysterious that only 4 appear on the first page.

jywarren commented 4 years ago

wow let's try this! https://ddnexus.github.io/pagy/how-to#custom-count-for-custom-scopes

@pagy, @records = pagy(custom_scope, count: custom_count)
jywarren commented 4 years ago

also noting https://ddnexus.github.io/pagy/extras/arel

jywarren commented 4 years ago

OK, ready to try this in unstable with:

source ../environment-dev.sh && docker-compose exec web passenger start

jywarren commented 4 years ago

custom count didn't work w/ nodes.count -- will try reproducing in unstable on the command line...

jywarren commented 4 years ago

OK, found a deep inconsistency:

irb(main):042:0> @pagy, nodesp = pagy(nodes.order('node_revisions.timestamp DESC')) # same as in controller
irb(main):048:0> nodesp.page(1).size
=> 20
irb(main):049:0> nodesp.page(1).collect(&:title)
=> ["Bucket Monitor", "Write an Issue Brief", "Facilitate your meetings", "Air Sampling"]
irb(main):050:0> nodesp.page(1).size
=> 20

Whoa....

irb(main):072:0> nodesp.page(1).length
=> 4
irb(main):073:0> nodesp.page(1).size
=> 20
irb(main):074:0> nodesp.page(1).count
=> 83
irb(main):075:0> nodesp.page(2).size
=> 20
irb(main):076:0> nodesp.page(3).size
=> 20
irb(main):077:0> nodesp.page(4).size
=> 20
irb(main):078:0> nodesp.page(1).size
=> 20

irb(main):079:0> nodesp.page(1).length
=> 4
irb(main):080:0> nodesp.page(2).length
=> 13
irb(main):081:0> nodesp.page(3).length
=> 6
irb(main):082:0> nodesp.page(4).length
=> 9
irb(main):083:0> nodesp.page(5).length
=> 10
irb(main):084:0> nodesp.page(6).length
=> 13
irb(main):085:0> nodesp.page(7).length
=> 8
irb(main):086:0> nodesp.page(8).length
=> 4

This totally doesn't match the displayed # of posts, which is:

jywarren commented 4 years ago

OK, it appears that the wild pagination occurs when running:

@pagy, nodesp = pagy(nodes.order('node_revisions.timestamp DESC'))

but not:

@pagy, nodesp = pagy(nodes.order('node.nid DESC'))

so i propose that as a short-term fix. This will sort them by descending nid which is very close to order of creation. Then we can deprioritize this - i'll leave a comment explaining: https://github.com/publiclab/plots2/commit/2a7d10030165b9b351391315600c944cb4198782