It doesn't matter if you join our workshop live or you prefer to do at your own pace, we have you covered. In this repository, you'll find everything you need for this workshop:
Don't forget to complete your upgrade and get your verified skill badge! Finish and submit your homework!
That's it, you are done! Expect an email next week!
ASTRA
service is available at url https://astra.datastax.com. ASTRA
is the simplest way to run Cassandra with zero operations at all - just push the button and get your cluster. No credit card or any payment required, $25.00 USD credit every month, roughly 5M writes, 30M reads, 40GB storage monthly - sufficient to run small production workloads.
Github
, Google
accounts or register with an email
.Make sure to chose a password with minimum 8 characters, containing upper and lowercase letters, at least one number and special character
Follow this guide and use the values provided below, to set up a pay as you go database with a FREE $25 monthly credit.
Parameter | Value |
---|---|
Database name | sql_to_nosql_db |
Keyspace name | spring_petclinic |
Ok, now that you have a database created the next step is to create a tables to work with.
In the Summary screen for your database, select CQL Console from the top menu in the main window. This will take you to the CQL Console and automatically log you in.
Ok, now we're ready to rock. Creating tables is quite easy, but before we create one we need to tell the database which keyspace we are working with.
First, let's DESCRIBE all of the keyspaces that are in the database. This will give us a list of the available keyspaces.
š Command to execute
desc KEYSPACES;
"desc" is short for "describe", either is valid
š Expected output
Depending on your setup you might see a different set of keyspaces then in the image. The one we care about for now is _springpetclinic.
From here, execute the USE command with the _springpetclinic keyspace to tell the database our context is within _springpetclinic.
š Command to execute
use spring_petclinic;
š Expected output
Notice how the prompt displays token@cqlsh:spring_petclinic>
informing us we are using the _springpetclinic keyspace. Now we are ready to create our tables.
š Command to execute
use spring_petclinic;
DROP INDEX IF EXISTS petclinic_idx_vetname;
DROP INDEX IF EXISTS petclinic_idx_ownername;
DROP TABLE IF EXISTS petclinic_vet;
DROP TABLE IF EXISTS petclinic_vet_by_specialty;
DROP TABLE IF EXISTS petclinic_reference_lists;
DROP TABLE IF EXISTS petclinic_owner;
DROP TABLE IF EXISTS petclinic_pet_by_owner;
DROP TABLE IF EXISTS petclinic_visit_by_pet;
/** A vet can have multiple specialties. */
CREATE TABLE IF NOT EXISTS petclinic_vet (
id uuid,
first_name text,
last_name text,
specialties set<text>,
PRIMARY KEY ((id))
);
/** We could search veterinarian by their names. */
CREATE INDEX IF NOT EXISTS petclinic_idx_vetname ON petclinic_vet(last_name);
/** We may want to list all radiologists. */
CREATE TABLE IF NOT EXISTS petclinic_vet_by_specialty (
specialty text,
vet_id uuid,
first_name text,
last_name text,
PRIMARY KEY ((specialty), vet_id)
);
/**
* Here we want all values on a single node, avoiding full scan.
* We pick am unordered set to avoid duplication, list to be sorted at ui side.
*/
CREATE TABLE IF NOT EXISTS petclinic_reference_lists (
list_name text,
values set<text>,
PRIMARY KEY ((list_name))
);
/** Expecting a combobox list references all specialties. */
INSERT INTO petclinic_reference_lists(list_name, values)
VALUES ('vet_specialty', {'radiology', 'dentistry', 'surgery'});
CREATE TABLE IF NOT EXISTS petclinic_owner (
id uuid,
first_name text,
last_name text,
address text,
city text,
telephone text,
PRIMARY KEY ((id))
);
/** We could search veterinarians by their names. */
CREATE INDEX IF NOT EXISTS petclinic_idx_ownername ON petclinic_owner(last_name);
CREATE TABLE IF NOT EXISTS petclinic_pet_by_owner (
owner_id uuid,
pet_id uuid,
pet_type text,
name text,
birth_date date,
PRIMARY KEY ((owner_id), pet_id)
);
CREATE TABLE IF NOT EXISTS petclinic_visit_by_pet (
pet_id uuid,
visit_id uuid,
visit_date date,
description text,
PRIMARY KEY ((pet_id), visit_id)
);
INSERT INTO petclinic_reference_lists(list_name, values)
VALUES ('pet_type ', {'bird', 'cat', 'dog', 'lizard','hamster','snake'});
describe tables;
š Expected output
In order for you to securely connect to your Cassandra database on Astra DB you need to generate an application token. The cool thing once you generate this once you can then use it for any of your applications or tools to talk to your database.
If you don't already have one follow the instructions HERE to generate your new token. Don't forget to download it once created because you will not be able to see it again without generating a new one.
Once you DOWNLOAD the token if you view the contents they should look something like this:
"Client Id","Client Secret","Token","Role"
"fdsfdslKFdLFdslDFFDjf","aaaaaaadsdadasdasdasdfsadfldsjfldjdsaldjasljdasljdsaljdasljdasljdlasjdal-FLflirFdfl.lfjdfdsljfjdl+fdlffkdsslfd","AstraCS:ppppdspfdsdslfjsdlfjdlj:540524888-04384039399999999999999999","Admin User"
ā ļø Important
The instructor will show you on screen how to create a token but will have to destroy to token immediately for security reasons.
You'll need to use this in a moment to authenticate with DSBulk so keep it handy.
In order to use DSBulk you need to download and install it. While you can do this locally if you would like following the instructions HERE we've already provided it for you using GitPod. Click the button below to launch your instance.
We're going to need a secure connect bundle to talk to our Astra DB instance with an external app. The bundle contains all of the information about where our cluster is up on the cloud and how to securly connect. After you launched GitPod you may have noticed a prompt asking you for your Token
.
Now, you will see a prompt asking for your cluster ID.
Go back to the Astra DB UI dashboard screen and:
sql_to_nosql_db
databaseFinally, paste the DB ID into prompt in GitPod and hit ENTER
. That's it, you should have your bundle.
š Expected output
owner
table SQL export into petclinic_owner
NoSQL tableOk, we're going to use DSBulk in this section to:
astra-creds.zip
owner
table)UUID()
Once this command is constructed it should look something like this:
An example of how to construct the above DSBulk command can be found HERE.
We've made this a little easier by constructing the command for you. Just run the dsbulk.sh
script. This will ask for the CLIENT ID and CLIENT SECRET you created earlier. When it asks, just paste in your value and hit ENTER
to go to the next step.
š Command to execute
bash dsbulk.sh
š Expected output
Now, go back to CQL Console
in your Astra DB UI and view the data from the petclinic_owner
table.
š Command to execute
SELECT * FROM petclinic_owner;
š Expected output
Awesome! You now have data in your new NoSQL table.
So great, you just ran the DSBulk command and something happened, but lets explain this a bit more.
First thing, here is the source CSV we are using generated from our SQL relational database for the owner
table.
If you remember from our discussion and the slide deck when moving into something like Cassandra we don't tend to use INTegers for IDs. We use UUIDs instead. This is to ensure IDs are truly unique when using a distributed system. With that, we need to transform the INT based IDs from the SQL data to UUIDs. We do this by using the **UUID()**
function in our INSERT statement.
This will generate UUIDs for us as data is inserted into our new Cassandra table.
Also, notice the header line in our CSV.
If you look back to the INSERT statement we used you may have noticed something like
VALUES (:first_name,:last_name,:address,:city,:telephone,UUID())
where you see items :first_name
and :last_name
. These are bindings. They are binding the values passed into the INSERT statement to the each column by the name of the column. You can use this to map values all sorts of ways and this just scratches the surface, but I think you get the idea.
Whoohoo! Congrats on making it to the end. While this workshop just touches on this process from a pretty high level hopefully it gives you an idea of the kinds of things you need to do and where to start. To learn more about what you can do with DSBulk take a look at the docs HERE. Also, don't forget that you can use Astra DB to experiment and play around with your data model for FREE well within the limits of the $25 monthly credit.