Closed brandonaaskov closed 4 months ago
The total count comes from Gmail search results and, as with all Google searches, it is an indicational number, not a "true" total results count. As long as there is a page token for the next page you can continue paging, even if the page count says otherwise. The Gmail API has no support at all for numeric paging; you can only use cursors. If you need page number-based paging, you should use the IMAP+SMTP-based connection, not the Gmail API.
Cursor-based paging is not a breaking change, because previously it was not possible to use the Gmail API as the email backend. But yes, I need to make this clearer in the API documentation.
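To make the "keep paging while there is a next-page token" advice concrete, here is a minimal sketch of a cursor-following loop. It assumes a response that carries a `messages` array next to `nextPageCursor`; the `MessagePage` type, the `fetchPage` callback, and the `fetchAllMessages` name are hypothetical and exist only for illustration, not as part of the EmailEngine API.

```typescript
// Paging fields used by this sketch. `messages` and the fetcher signature
// are assumptions for illustration; only nextPageCursor is named in the thread.
interface MessagePage<T> {
  messages: T[]
  nextPageCursor: string | null
}

// Hypothetical callback that wraps whatever HTTP call lists one page.
type FetchPage<T> = (cursor: string | null) => Promise<MessagePage<T>>

// Follows nextPageCursor until it is null and ignores total/pages entirely.
async function fetchAllMessages<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = []
  let cursor: string | null = null
  do {
    const page: MessagePage<T> = await fetchPage(cursor)
    all.push(...page.messages)
    cursor = page.nextPageCursor
  } while (cursor !== null)
  return all
}
```

Because the loop only looks at the cursor, it behaves the same whether the reported `total` and `pages` values are exact or estimated.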
@andris9 I disagree, and maintain it is a breaking change, but only because it's the API layer. That's the normalized layer where I don't need to care about the underlying implementation (Outlook, Gmail, IMAP, etc). Just because the Gmail API is being used, it shouldn't change what total means. As it stands, total behaves differently for just this one implementation. So as a user of the API, I now have to fork my code to handle the different meaning of total for Gmail accounts.
Plus, even if it was coming from search results, there are 100 items returned in the array, so this "indicational number" of 201 is literally indicative of nothing: I have no idea where it's coming from or why it's 201. There are 495 messages in total, 100 per page. And yet I'm seeing that there are 3 pages even though I can fetch 4, and the total is 201 until the last page is fetched, and then the number is accurate at 95. This is, 100%, a bug and a breaking change.
From your own blog post, it's pretty evident that the scopes required for the IMAP-based connection to Gmail will only work for "internal" apps.
The minimum permission set requirement is probably the one that will sink your application to get EmailEngine integrated with Gmail accounts. EmailEngine requires access to the highly restricted "https://mail.google.com/" OAuth2 scope. This is the only scope that allows access to IMAP and SMTP – the protocols that EmailEngine uses.
Unfortunately, Google would probably consider that scope too wide for whatever use case you have and ask you to use more restrictive scopes. These restrictive scopes, in theory, give you access to the required data but not to IMAP and thus are unusable by EmailEngine. If you can convince Google that the features you need are only available via IMAP, then you might pass the review. Obviously, there are no guarantees.
The same post then says "If you can afford it and you are able to weasel yourself through the verification process, go with the public OAuth2." It's clear, and I appreciate it, that this option is a non-starter for anything besides an internal-only use case.
You even say in this post "However, IMAP and SMTP are not always the most suitable options. Recognizing this, we have been working on adding additional email backends to EmailEngine", which again, I appreciate. But I still think the API should return the right value for total and pages, because right now they're wrong.
Fwiw I am a paying customer, and I'm a little miffed that you dismissed this issue so easily without even reading it, because if you had, you would have seen the error with pages being incorrect and total being incorrect until the last page is fetched.
Unfortunately, it is not possible to get the actual message count from the Gmail API. EmailEngine asks the API "how many emails match this query?" and the server responds with "201, maybe, who knows 🤷♂️". So the correct option would probably be to remove the pages and total values from the output when using the API backend, because these are almost never correct.
This is only a concern for me during the on-boarding phase of my app, and in my case all I've really lost is the ability to show a progress bar (when I know the total) instead of a spinner. Not the end of the world for my use case.
But I'm curious @andris9, why is the total value always correct on the last page for these Gmail API accounts? The last page's total is always correct, even if pageSize is so large it returns everything in the first page.
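For the progress-bar-versus-spinner case described above, one possible approach is to show a percentage only when the total can be trusted. The sketch below assumes the caller knows whether the backend reports exact totals; the `totalIsExact` flag, the `Progress` type, and the `progressFor` helper are all hypothetical names introduced for this example.

```typescript
// Paging fields discussed in this thread.
interface PagingInfo {
  total: number
  nextPageCursor: string | null
}

type Progress =
  | { kind: 'determinate'; fraction: number }
  | { kind: 'indeterminate'; fetched: number }

// totalIsExact is a caller-supplied assumption, for example true for
// IMAP-backed accounts and false for Gmail API accounts.
function progressFor(fetched: number, paging: PagingInfo, totalIsExact: boolean): Progress {
  // No next cursor means the last page has been fetched.
  if (paging.nextPageCursor === null) {
    return { kind: 'determinate', fraction: 1 }
  }
  // An exact total allows a real percentage, capped at 100%.
  if (totalIsExact && paging.total > 0) {
    return { kind: 'determinate', fraction: Math.min(fetched / paging.total, 1) }
  }
  // With an estimated total, report a running count instead.
  return { kind: 'indeterminate', fetched }
}
```

With this shape the UI code branches on `kind` once, rather than on the account type in several places.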
The total value is provided by Gmail. EmailEngine has no way at all to know what that value means or why it was chosen. It's an order-of-magnitude approximation. EmailEngine uses the total value to calculate the pages value, and if the total is wrong, then the pages value is wrong as well. The only value you can trust is nextPageCursor: as long as it is set, you can continue paging the results.
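The reported numbers (a total of 201 with a page size of 100 giving 3 pages) are consistent with the page count being the estimate divided by the page size, rounded up. The exact formula EmailEngine uses is not stated in this thread, so `pagesFromTotal` below is a hypothetical helper that only illustrates the arithmetic.

```typescript
// Assumed formula: page count is the (possibly estimated) total divided by
// the page size, rounded up. Not taken from EmailEngine's source.
function pagesFromTotal(total: number, pageSize: number): number {
  return Math.ceil(total / pageSize)
}

console.log(pagesFromTotal(201, 100)) // 3: an estimate of 201 yields three pages
console.log(pagesFromTotal(495, 1000)) // 1: everything fits on a single page
```

Any error in the estimate carries straight through to the page count, which is why a next-page cursor can still be present on "page 3 of 3".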
@andris9 I was previously building my own Gmail and Outlook integrations before finding EmailEngine. Here's a very basic test that can be run by anyone to verify this issue on the Gmail side:
import 'dotenv/config'
import { gmail_v1, google } from 'googleapis'
import { OAuth2Client } from 'googleapis-common'

async function example(accessToken: string, page?: string) {
  const auth: OAuth2Client = new google.auth.OAuth2(
    process.env.GMAIL_CLIENT_ID,
    process.env.GMAIL_CLIENT_SECRET,
    `http://localhost:3000/oauth`
  )
  auth.setCredentials({ access_token: accessToken })
  const gmail: gmail_v1.Gmail = google.gmail({ version: 'v1', auth })

  do {
    const response = await gmail.users.messages.list({
      userId: 'me',
      q: 'in:sent',
      maxResults: 500, // "typical" max limit for gmail
      pageToken: page,
    })
    // ...do stuff with page and/or messages
    console.log(response.data.resultSizeEstimate)
    // nextPageToken is absent (or null) on the last page, which ends the loop
    page = response.data.nextPageToken ?? undefined
  } while (page)
}

// placeholder: you can get the token from /v1/account/{account}/oauth-token
const accessToken = '<access token>'
example(accessToken).catch(console.error)
The only time it gets it right is when the total number of results is less than maxResults. @andris9 you would know better than me, where is the best place for me to file this issue with Google (or at least back you up if you've already raised the issue)?
It is not a bug on Google's side. The value is called an estimate (resultSizeEstimate), so it does not have to be exact. They probably cannot return the actual result size, as calculating it would be too slow or resource intensive, so they use some kind of estimation algorithm to come up with the number instead.
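If an exact count is genuinely needed, one option is to count the returned message entries while paging, instead of relying on `resultSizeEstimate`. The sketch below assumes a `listPage` callback that returns the body of a `users.messages.list` response; `ListResponse`, `ListPage`, and `countExactly` are hypothetical names, and the interface lists only the fields used here.

```typescript
// Subset of the users.messages.list response body used by this sketch.
interface ListResponse {
  messages?: { id?: string | null }[]
  nextPageToken?: string | null
  resultSizeEstimate?: number | null
}

// Hypothetical callback that fetches one page for the given token.
type ListPage = (pageToken?: string) => Promise<ListResponse>

// Counts messages page by page and records each estimate for comparison.
async function countExactly(
  listPage: ListPage
): Promise<{ exact: number; estimates: number[] }> {
  let exact = 0
  const estimates: number[] = []
  let pageToken: string | undefined
  do {
    const res: ListResponse = await listPage(pageToken)
    exact += res.messages?.length ?? 0
    if (typeof res.resultSizeEstimate === 'number') {
      estimates.push(res.resultSizeEstimate)
    }
    pageToken = res.nextPageToken ?? undefined
  } while (pageToken)
  return { exact, estimates }
}
```

With the googleapis client from the test above, `listPage` could be something like `(pageToken) => gmail.users.messages.list({ userId: 'me', q: 'in:sent', maxResults: 500, pageToken }).then((r) => r.data)`. The trade-off is one list request per page, so the exact count is only known once the last page has been read.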
@andris9 that's fair, I can see why they cleverly named it that way. But let's just say there was a guy out there who had a loud mouth and wanted to convey to Google that this should be fixed. Where would be the best place for a guy like that to go directly and maybe be heard?
Describe the bug
I'll show a few screenshots, all fetched for the same account ID, to illustrate the error.

pageSize=1000
Here the paging data is accurate because there is only one page: total: 495

pageSize=100 (you can also remove the param entirely)
3 pages, 100 per page, but a total of 201?

Using cursor ...
On page 3 of 3 but there's still a nextPageCursor value, total is still incorrect at 201

If I use the nextPageCursor for the fourth page, I do get messages back and nextPageCursor is null, but the page and pages values are very wrong, while total is finally correct.

To Reproduce
Steps to reproduce the behavior: You can do this directly against the EmailEngine API with something like Postman. This only appears to affect Gmail accounts.

EmailEngine version
2.43.0

Environment
Just running it locally

Redis

Additional context
This latest change to EmailEngine broke the ability to use the page value and forces using cursors for Gmail accounts, which is a breaking change and has definitely slowed me down unexpectedly. Thankfully my app is not in production yet. You even list it under "Bug Fixes" in the latest release, but "Adding API support" for something is not a bug fix, that's a feature, and a breaking change in this case even though it was a minor release.