Open christian-draeger opened 5 years ago
@SimonLab Do you think you could take a look at this?
It should just be a matter of updating the query to be DISTINCT
.
Watch out for this gotcha: https://stackoverflow.com/questions/11250253/postgresql-countdistinct-very-slow
And if we can do it using idiomatic Ecto
(without returning all the data to the Phoenix App and wasting bandwidth) it would be even better!
@nelsonic Looking at this now
At the moment the count
value is returned when a hit is inserted.
So we'll need to update the aggregate function:
https://github.com/dwyl/hits/blob/b60b329b4096ac690af00cc3391399a7ede79916/lib/hits/hit.ex#L24-L30
I'm thinking of adding a filter
query parameters to the url (as mention on the first comment): ...svg?filter=day
@SimonLab good plan. The only thing to note is at present the Ecto Repo.aggregate
query is somehow being ultra-inefficient and thus consuming a ton of bandwidth; see: https://github.com/dwyl/learn-devops/issues/72#issuecomment-958836871
Basically we might need to write a "Raw SQL" query to avoid sending all the data from PostgreSQL to Phoenix each time ... 💭
We really want to streamline the query to do the COUNT DISTINCT
on the PostgreSQL server to only return the int value rather than all the data. 🤞
I think the aggregate Ecto query is already using the count distinct
and only the count is sent back to Phoenix. This is the generated query:
SELECT count(s0."id") FROM (SELECT DISTINCT ON (sh0."useragent_id") sh0."id" AS "id" FROM "hits" AS sh0 WHERE (sh0."repo_id" = $1)) AS s0 [1]
@SimonLab cool. thanks for confirming. 👍
Hey, i'm wondering if it's possible to configure the badge to only show unique hits (over a day or specific period of time). This option would give a more meaningful hint to repo owners that are having the "hits-badge" added to their readme files.
As I saw you already storing the hit data including a hash (consisting of user-agent, IP and timestamp) which in turn means you already know if a hit is unique or not - why not adding this option to the badge? :)
for instance something like
http://hits.dwyl.io/skrapeit/skrape.it.svg?uniquePerDay
would be handy.i think what bothers me the most is that every time you adjust your readme (and have the hit-badge included in the readme) IntelliJ's preview mode for md files will render the readme. This causes every keystroke to increment the hit count. Which leads to the issue that the hit-count is extremely falsified and no longer has any significance to the repo owner. here is a screen record that describes the problem: