anku255 opened 9 months ago
Regarding `id`, `type` etc.: we don't really have any examples of other similar jobs, and there is no point making it generic (to keep things simpler). Everything related to this will have its own table / cronjob / APIs anyway. So the APIs listed here can just be:

What is `status` on a job level exactly? If the job is there in the db, then it's pending, else it's finished. If it had partially failed, how will the status help know what parts of it are done?

Regarding `job/:job_id` in the core: the core routing does not allow for placeholders in paths easily, so we just make it part of the query param instead.

> The main advantage of this approach lies in avoiding the addition of a CRON job

This is a very tiny advantage IMO. What other advantages are there?
@rishabhpoddar After discussions, we've opted for method 1. I've expanded the original post to elaborate on it in detail.
> Database Schema

Please give it in SQL format like `CREATE TABLE ...` and specify any keys.

> Request Body of API to post json

The totp object needs skew, period, deviceName (optional). Please see the API to import a device.

> Size Limit: 10,000 users

The test you ran, what are the specs of the setup?

> All possible errors for the /add-users endpoint
>
> thirdPartyId must be one of 'apple', 'discord', 'facebook', 'github', 'google', 'linkedin' or 'twitter'.

This is not true.

> Get bulk import users

> Delete bulk import users by importId

Why do we need this API? Why can't we just auto clear the space in the db?

> A cron job, ProcessBulkImportUsers, runs every minute to handle users added to the bulk_import_users table. It processes users (status = 'NEW') in batches of 1000, updating their status to 'PROCESSING' before processing and to 'PROCESSED' afterward. If processing fails, the error_msg column is updated with error messages.

> The query in getUsersToProcess function

> All possible errors for the Cron Job

> Pseudo code for the cron job

> Utilizing the Bulk Import API
@rishabhpoddar
> Please give it in SQL format like `CREATE TABLE ...` and specify any keys.
Done.
> The totp object needs skew, period, deviceName (optional). Please see the API to import a device.
Done.
> The test you ran, what are the specs of the setup?
I ran the tests on my MacBook Pro, which has pretty good specs, but I was able to insert users up to 50K as well. 10K is a conservative number that should accommodate lower-end specs too.
> What if the user being imported already exists (either a user in ST, or as an entry in the bulk table)?
We'll update the data, overwriting fields wherever applicable for simplicity. If a user is performing a bulk import while their previous auth provider is active, it's possible that user data could be updated for some users after creating the export file. Allowing users to update and create users in the same API would be helpful.
> thirdPartyId must be one of 'apple', 'discord', 'facebook', 'github', 'google', 'linkedin' or 'twitter'.
>
> This is not true.

Can you please elaborate? Do we support every thirdparty?
> Why do we need importId (seems like an overkill)?

Upon some discussion, we decided to remove it.
> Does it work for mysql and postgresql? Is the FOR UPDATE necessary?

Yes, it works in both MySQL (v8 and above) and PostgreSQL (v12 and above).

The information on the atomicity of Common Table Expressions (WITH statement) is ambiguous. While some sources suggest potential issues where another query might select the same rows before they are updated, incorporating FOR UPDATE provides an additional layer of safety.
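For reference, a minimal sketch of that pattern using the node-postgres client. The table and column names (bulk_import_users, status, created_at) follow this thread, the RETURNING clause is Postgres-specific, and the actual query lives in the core/plugin PRs, so treat this as illustrative only:

```ts
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the PG* env vars

// Claim a batch of NEW users atomically: the CTE selects and locks the rows
// (FOR UPDATE), and the outer UPDATE flips them to PROCESSING in the same statement.
async function getUsersToProcess(batchSize: number) {
    const { rows } = await pool.query(
        `WITH to_process AS (
            SELECT id FROM bulk_import_users
            WHERE status = 'NEW'
            ORDER BY created_at
            LIMIT $1
            FOR UPDATE
        )
        UPDATE bulk_import_users
        SET status = 'PROCESSING'
        WHERE id IN (SELECT id FROM to_process)
        RETURNING *`,
        [batchSize]
    );
    return rows;
}
```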
> I don't think this is a good idea cause the entry being processed could have other information like a different set of roles, in which case, would you overwrite the roles for the older user or merge them?
I have written more about this in the point above.
> How is the user supposed to know about these errors and handle them? They would have to modify / delete that specific user perhaps to fix this issue.. How do they do that?
Good point. The API will provide the index of the user requiring attention. Our script can then handle the removal of users with errors, creating a separate file for them. Users with errors can be addressed manually later.
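A rough sketch of how the script could do this is below. The response shape (an errors array where each entry carries the index of the offending user in the submitted batch) is an assumption based on the description above:

```ts
import * as fs from "fs";

// Assumed error shape: the add-users response reports the index of each user
// in the submitted batch that needs attention.
type AddUsersError = { index: number; message: string };

function splitOutUsersHavingErrors(users: unknown[], errors: AddUsersError[]) {
    const badIndexes = new Set(errors.map((e) => e.index));
    const usersHavingErrors = users.filter((_, i) => badIndexes.has(i));
    const validUsers = users.filter((_, i) => !badIndexes.has(i));

    // Keep the failures aside so they can be fixed and re-imported manually later.
    fs.writeFileSync(
        "usersHavingErrors.json",
        JSON.stringify({ users: usersHavingErrors, errors }, null, 2)
    );
    return validUsers; // retry the request with only the valid users
}
```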
> What happens in the cron if an error is encountered for a user? Does it move on to the next user or stop there?

Yes, the cron will continue processing other users. The user having the error will be marked as FAILED and an error message will be set.
> Why do users need to use just in time migration along with bulk migration? Sure, there might be new user sign ups during the bulk migration, but users can just add those later on to ST anyway?
Not only could there be new sign-ups, but existing user data can be updated as well. Users can call the API again with the updated data and make it work without "just in time" migration as well. This requires that the bulk import API updates the data if the user already exists.
> Please mention what the implementation of importUser is. I need to see that in a flow diagram clearly, in detail.
Please find pseudo code for the importUser function below. It is missing some things like user roles, but it should be enough to give you an idea of how I am thinking about this.
```
function importUser(user) {
  try {
    Start Transaction;

    let EPUserId = null;
    let TPUserId = null;
    let PWlessUserId = null;
    let primaryUserId = null;

    // Set isPrimary to true for the first loginMethod if none of the methods have it set
    if (!user.loginMethods.some((method) => method.isPrimary)) {
      user.loginMethods[0].isPrimary = true;
    }

    for (loginMethod of user.loginMethods) {
      if (loginMethod.recipeId === 'emailpassword') {
        EPUserId = CreateUserForEP(loginMethod, user.externalUserId);
        if (loginMethod.isPrimary) {
          primaryUserId = EPUserId;
        }
      }
      if (loginMethod.recipeId === 'passwordless') {
        PWlessUserId = CreateUserForPasswordless(loginMethod, user.externalUserId);
        if (loginMethod.isPrimary) {
          primaryUserId = PWlessUserId;
        }
      }
      if (loginMethod.recipeId === 'thirdparty') {
        TPUserId = CreateUserForThirdParty(loginMethod, user.externalUserId);
        if (loginMethod.isPrimary) {
          primaryUserId = TPUserId;
        }
      }
    }

    if (user.loginMethods.length > 1) {
      Call accountlinking.createPrimaryUser for primaryUserId;
      Call accountlinking.linkAccount for every userId other than primaryUserId;
    }

    if (user.usermetadata) {
      Update user metadata for the primaryUserId;
    }

    if (user.mfa) {
      Call mfa related APIs for the primaryUserId;
    }

    // Remove the entry once the user has been fully imported
    Delete user from bulk_import_users table;
    Commit Transaction;
  } catch (e) {
    Rollback Transaction;
    user.status = 'FAILED';
    user.error_msg = e.message; // Set a meaningful error message
  }
}
```
```
function CreateUserForEP(data, externalUserId) {
  Validate data fields;
  Call EmailPassword.importUserWithPasswordHash;
  if (data.isPrimary) {
    Link this user to externalUserId;
  }
  if (data.time_joined) {
    Update time_joined of the created user;
  }
  if (data.isVerified) {
    Update email verification status;
  }
  return newUserId;
}

function CreateUserForPasswordless(data, externalUserId) {
  Validate data fields;
  Call Passwordless.createCode;
  Call Passwordless.consumeCode; // Email will be verified automatically here
  if (data.isPrimary) {
    Link this user to externalUserId;
  }
  if (data.time_joined) {
    Update time_joined of the created user;
  }
  return newUserId;
}

function CreateUserForThirdParty(data, externalUserId) {
  Validate data fields;
  Call ThirdParty.SignInAndUp API; // Pass data.isVerified
  if (data.isPrimary) {
    Link this user to externalUserId;
  }
  if (data.time_joined) {
    Update time_joined of the created user;
  }
  return newUserId;
}
```
> We'll update the data, overwriting fields wherever applicable for simplicity. If a user is performing a bulk import while their previous auth provider is active, it's possible that user data could be updated for some users after creating the export file. Allowing users to update and create users in the same API would be helpful.

We decided that we will not be updating existing users in the bulk migration logic. If we see a repeated user in the bulk import (same external user id, or same account info in the same tenant), then we will mark it as an error.
When a user generates a bulk import JSON file from an existing authentication provider, ongoing changes to user data can make the import outdated by the time it's completed. For example, if a user updates their password after the JSON file is created, they will not be able to log in with their new password on SuperTokens.
A straightforward solution is to perform a second bulk import after switching to SuperTokens. If the Bulk Import API supports updates, the new data can replace or add to what's in the SuperTokens database, helping to correct any inconsistencies from the initial import.
However, this approach has drawbacks:

- Delayed Consistency: Data inconsistencies will persist until the second import completes. This delay can lead to various issues, including users being unable to log in, especially if significant time elapses between the first and second imports.
- Data Overwrite Risks: If user data in SuperTokens is updated directly between the two imports, a second import might inadvertently overwrite these newer updates with older data. This risk can be mitigated by checking the updated_at timestamp to ensure data consistency (see the sketch below).
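For illustration, a minimal sketch of that check, assuming both the export file and SuperTokens expose an updated_at timestamp per user (field names are placeholders):

```ts
// Decide whether the second bulk import may overwrite a user's data.
// Only overwrite data that has not been modified in SuperTokens since
// the export file was generated.
function shouldOverwrite(existingUpdatedAt: number, importedUpdatedAt: number): boolean {
    return importedUpdatedAt > existingUpdatedAt;
}
```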
LazyMigration Recipe

Create a LazyMigration recipe to seamlessly migrate users to SuperTokens upon their first login or signup. Here's a potential implementation:
```ts
recipeList = [
  LazyMigration({
    getUser: (email?: string, phoneNumber?: string, password?: string, thirdPartyInfo?: ThirdPartyInfo, isEmailVerified?: boolean): BulkImportJSON => {
      // Fetch user data from the existing authentication provider
    },
  }),
]
```
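For illustration, a hypothetical getUser implementation could look like the sketch below. The legacy endpoint URL, its response fields, and the async return are all placeholders for whatever your existing provider exposes; the BulkImportJSON mapping reuses the fields referenced elsewhere in this issue:

```ts
const getUser = async (email?: string): Promise<BulkImportJSON | undefined> => {
    if (email === undefined) {
        return undefined;
    }
    // Assumed legacy lookup endpoint; replace with your provider's API.
    const res = await fetch(
        `https://legacy-auth.example.com/users?email=${encodeURIComponent(email)}`,
        { headers: { Authorization: `Bearer ${process.env.LEGACY_AUTH_API_KEY}` } }
    );
    if (!res.ok) {
        return undefined; // user does not exist in the old provider
    }
    const legacyUser = await res.json();
    // Map the legacy record onto the bulk import JSON shape.
    return {
        externalUserId: legacyUser.id,
        loginMethods: [{
            recipeId: "emailpassword",
            email: legacyUser.email,
            passwordHash: legacyUser.password_hash,
            isVerified: legacyUser.email_verified,
            isPrimary: true,
            time_joined: legacyUser.created_at,
        }],
    };
};
```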
The process unfolds as follows:
1. Transition to SuperTokens: Integrate SuperTokens into your application and configure the LazyMigration recipe. Start using SuperTokens for authentication.
2. User Authentication: When an existing user logs in for the first time, the signUp flow in SuperTokens will retrieve their data from your existing authentication provider using the LazyMigration recipe's getUser function. The user is then imported into SuperTokens. New users can sign up directly with SuperTokens.
3. Bulk Import: Use Bulk Import to migrate all existing users to SuperTokens at once. Users previously migrated via the LazyMigration recipe will be automatically skipped during this process.
4. Deactivate existing auth provider: Upon completion of the bulk import, you can safely remove the existing authentication provider from your system.
You just follow our docs as is.
You just follow our docs as is, except for session verification, where you do the following:
```ts
app.post("/verify", verifySession({ sessionRequired: false }), async (req, res) => {
    let userId = undefined;
    if (req.session !== undefined) {
        // this means a supertokens session exists
        userId = req.session.getUserId();
    } else {
        // try doing auth with existing auth provider
        userId = await yourExistingAuthVerify(req);
    }

    if (userId === undefined) {
        // send a 401
    } else {
        // Your API logic.
    }
});
```
Talk about the migration API here and link to the JSON schema explanation in the "1) Creating the user JSON" page. Also mention that if you call the API twice for the same user, it will not update the second time, and if users want to update some information, they will have to do so by calling individual APIs. For example, if you want to update the password hash of this user, call XYZ API, or if you want to update roles, call the roles PUT API (link to the CDI spec).
Start the SuperTokens core service connected to your database
Import the JSON users into SuperTokens:
Time / ram estimations for 1M users.
Once you upload the JSON, the core will run a cron and process users at the rate estimated above, etc.
Monitor based on the GET API
How to know when the migration is done? (see the sketch after this list)
How to know about errors and fix them?
How to know how many users are pending?
How to delete entries from the migration table?
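For example, the monitoring section could show a small polling loop like the one below. This is a hedged sketch: it assumes the /bulk-import/users/count endpoint mentioned in the TODOs further down, a status query param, and the usual api-key header; the exact paths belong to the CDI spec:

```ts
const CORE_URL = "http://localhost:3567"; // SuperTokens core

// Poll the core until no users are left to process.
async function waitForBulkImportToFinish(): Promise<void> {
    while (true) {
        const res = await fetch(`${CORE_URL}/bulk-import/users/count?status=NEW`, {
            headers: { "api-key": process.env.SUPERTOKENS_API_KEY ?? "" },
        });
        const { count } = await res.json();
        if (count === 0) {
            break; // every user has been picked up by the cron
        }
        console.log(`${count} users still pending...`);
        await new Promise((resolve) => setTimeout(resolve, 10_000)); // poll every 10s
    }
}
```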
Time / ram estimations for 1M users:
For now, let it be how it is in the docs
For now, let it be how it is in the docs
- Mention the problem that users may have updated their data whilst the migration is happening.
- This is an unsolved problem for now, but we will solve it by introducing a lazy migration recipe (coming soon).
- If you really don't want this problem, you should stop your existing auth provider before you start creating the user JSON. This will ofc add to downtime for your application.
- You have to keep your existing auth provider running for the amount of time you think enough users have signed in to your application.
- The rest of the users will have to be imported through bulk migration with random passwords and then they will have to go through the password reset flow.
https://docs.google.com/document/d/1TUrcIPbdsHfqheIkB6CTkTNiA6YN-73Xz_pX2kjXhkc/edit
Open PRs:
TODO:
[ ] Test with 1M users in CICD. Make sure users are generated in a way that have various login methods, tenancy, metadata, roles etc. Make sure the time the test takes is not too much, and things work well and are consistent.
[ ] Create an API for starting/stopping the cron job, and another one for getting the status of the cron job (active/inactive). The processing batch size should be a parameter for the starting API.
[ ] Allow developers to configure parallelism in ProcessBulkImportUsers cron job
Currently, it takes on average 66 seconds to process 1000 users. This is very slow when processing a large number of users. It happens because we loop through the users one by one in a for loop and also use just 1 DB connection for BulkImportProxyStorage.
The solution is to process users in parallel using threads and create a BulkImportProxyStorage instance for each user that is being processed. The number of users we will process in parallel will depend on the `BULK_MIGRATION_PARALLELISM` config value set by the user. This will be a SaaS protected prop and can be added to `PROTECTED_CONFIGS` in CoreConfig.java. It should have the `@NotConflictingInApp` annotation.
[ ] PR changes in supertokens-core PR
[ ] PR changes in supertokens-postgresql-plugin PR
[ ] Changes in Node Script to add users
The script needs to be rewritten to optimise for the following user stories:
- The user is not expected to monitor the script. The script should try to continue processing and retry failures wherever possible.
- The user should be able to re-run the script using the same input file multiple times and the script should process the remaining users. This can be implemented by maintaining a state file per input file name.
- The Core API calls should have exponential backoff retries (unlimited) in case we get an error from the Core. This is to ensure that we don't halt processing in case the Core is down for a few seconds (see the sketch after this list).
- The script should continue showing the Bulk Import Cron Job status after it has added all the users. Any users with status=`FAILED` will be added to the same `usersHavingInvalidSchema.json` file. This file could also be renamed to something like `usersHavingErrors.json`. Since we will be writing to the `usersHavingErrors.json` file, users are expected to wait until all the users have been processed, then fix the error file and add those users again.
- The script should display progress logs while adding the users. This could include the total number of users, number of users added, number of users having errors, etc.
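A sketch of the retry wrapper described above; the delays and cap are illustrative, the important part is that retries are unbounded so a brief Core outage does not halt the script:

```ts
// Retry a Core API call forever with (capped) exponential backoff so the
// script keeps going even if the Core is unreachable for a while.
async function callCoreWithRetries<T>(doRequest: () => Promise<T>): Promise<T> {
    let attempt = 0;
    while (true) {
        try {
            return await doRequest();
        } catch (err) {
            attempt++;
            const delayMs = Math.min(1000 * 2 ** attempt, 60_000); // cap at 1 minute
            console.warn(`Core request failed (attempt ${attempt}), retrying in ${delayMs} ms`);
            await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
    }
}
```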
We also need to research the size limit of the JSON file. A JSON file with a million users would be about 880 MB. JSON files cannot be streamed; the whole file needs to be read into memory. If this is an issue, we may need to switch to the `ndjson` file format, which allows streaming the file (a sketch follows).
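If we do switch, streaming ndjson line by line is straightforward in Node (a minimal sketch; each line holds one user object):

```ts
import * as fs from "fs";
import * as readline from "readline";

// Read an ndjson file one line (= one user) at a time without loading it all into memory.
async function* readUsers(filePath: string): AsyncGenerator<unknown> {
    const rl = readline.createInterface({
        input: fs.createReadStream(filePath),
        crlfDelay: Infinity,
    });
    for await (const line of rl) {
        if (line.trim().length > 0) {
            yield JSON.parse(line);
        }
    }
}
```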
[ ] Documentation for Bulk Import
[ ] Update the CDI spec to include the `/bulk-import/import` and `/bulk-import/users/count` APIs. Also update the `BulkImportUser` schema to include the `plainTextPassword` field.
[ ] Bulk Import for Auth0 users (ignore for now)
After the Bulk Import task is complete, we plan to have a special guide for Auth0 users.
Auth0 users need to request the exported file from their support team if they want the password hashes of the users. For any other type of login, they can export the users themselves using the API. However, this API doesn't include user roles.
We could write a script that takes the exported JSON and Auth0 credentials. The script would fetch all the roles and map them to the users, as sketched below. (We could also call the roles API for each user, but that would take more API calls.)
We can also add a separate page in our documentation for Auth0 users that guides them through requesting the export JSON file and running the above script.
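A rough sketch of what that script could do, assuming an Auth0 Management API token and the GET /api/v2/roles and GET /api/v2/roles/{id}/users endpoints (pagination and rate limiting omitted):

```ts
// Build a userId -> role names map from Auth0 so roles can be attached
// to the users in the exported JSON before bulk import.
async function fetchAuth0UserRoles(domain: string, mgmtToken: string): Promise<Map<string, string[]>> {
    const headers = { Authorization: `Bearer ${mgmtToken}` };

    const rolesRes = await fetch(`https://${domain}/api/v2/roles`, { headers });
    const roles: { id: string; name: string }[] = await rolesRes.json();

    const userRoles = new Map<string, string[]>();
    for (const role of roles) {
        // One call per role rather than one call per user.
        const usersRes = await fetch(`https://${domain}/api/v2/roles/${role.id}/users`, { headers });
        const users: { user_id: string }[] = await usersRes.json();
        for (const u of users) {
            userRoles.set(u.user_id, [...(userRoles.get(u.user_id) ?? []), role.name]);
        }
    }
    return userRoles;
}
```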
Order of changes: