UTHSCSA-CIRD / simSQL

Script for populating a database with simulated data based on table definitions (and eventually custom datatypes specified in config files).
GNU Lesser General Public License v2.1
0 stars 1 forks source link

Use zip(), subsetting, indexing, and string manipulation to finish the simTable() command #3

Open bokov opened 10 years ago

bokov commented 10 years ago

There is a more standard and beautiful solution than iterating over an object inside square brackets like we discussed last week.

Some of the sub-items contain spoilers, so read and attempt each one before reading the next.

  1. Read this: https://docs.python.org/2/library/functions.html#zip . Try playing with the zip() function in the Python shell until you understand what it does.
  2. Note that by line 121 of simSQL2.py there is a dictionary object called tabletemp.
    1. Now, ask yourself, is each value in tabletemp a column of data or a row of data? How can you tell?
    2. Do SQL insert statements insert rows or columns?
    3. Do you see now why I asked you to learn the zip() function? Try incorporating it after line 121 of simSQL2.py
  3. Are you having problems extracting data in the correct format from tabletemp?
    1. How do you get just the values (not the keys) from tabletemp? Review the Python documentation for dictionaries if you need to.
    2. It might not be directly stated in the docs, but for our purposes, zip() expects multiple lists as arguments. In Python, there is a way to unpack an iterable object into multiple function arguments. Read about it here: https://agiliq.com/blog/2012/06/understanding-args-and-kwargs/ . This will tell you more than you need to know for this ticket, but you will need this information for a future ticket, so it's all going to be useful.
    3. Now it should be possible for you to turn tabletemp into a collection of rows.
  4. Now, how would you actually insert each row into the internal database? For starters, you need to end up with a text string for this table that looks like this: insert into FOOTABLE(COLA,COLB,COLC,...) values(?,?,?,...) where the COLX etc. should be replaced by column names for that table and FOOTABLE by the tableName variable. The ,... indicates that the number of variables can vary (there will be as many variables as there are columns, which is the same as saying that there will be as many variables as the length of each row you created above). The ?s should remain question-marks, but there should be as many of them as there are variables. Modify simTable() to print that string instead of printing tabletemp.
    1. What operator is used to concatenate strings in Python? Hint: looks like an arithmetical operator.
    2. Can you use this to concatenate a literal string (e.g. "insert into ") to a string that is stored in a variable (e.g. tableTemp)?
    3. Remember, the names of the columns become the names of the keys in the tabletemp dictionary. What method of the dictionary object do you use to get those keys as a list? Can this list be concatenated directly to a string literal?
    4. If you have a list of strings, you can mash them together with a delimiting string of your choice using the .join() method. Google it to understand how to use it. Do you see where it will be needed for constructing your insert string?
    5. Above we use an arithmetical operator that in Python is overloaded to permit string concatenation. There is another one, *, that allows you to replicate strings. For example 'Z3'*5 returns 'Z3Z3Z3Z3Z3'. Do you see where it will be needed for constructing your insert string?
    6. But ugh, you have an extra trailing character at the end. Can you think of a way to remove it?
    7. Okay, that probably works, but it's ugly. Interesting fact, strings in Python can use the same slice subscripting syntax as lists. Here is a really good explanation of slice notation which you should read. Can you see how you can remove the trailing comma using slice notation?
  5. Now you have an insert command, generated dynamically for each table (and it remains the same for each column in the table), with placeholders (that's what those ? are). You can use cur.execute(FOO,BAR) where FOO is your insert string (or a variable you assigned it to) and BAR is a list object of the same length as the number of ? placeholders. Try actually running this in the Python debugger (the second breakpoint, within the body of simTable() .
  6. Modify simTable() so that it iterates over all the rows you created using zip() and uses each one as the second argument for cur.execute() as above.
bokov commented 10 years ago

Test of comment

jtchien0925 commented 10 years ago

?Hello Alex,

How are you doing Alex? I hope you all doing well on everything especially your the other degree things. Finger crossed.

I'm back to Texas. I took a look of code of simSQL2.py ... I cant solve the issue of line 137 would you please help me to take a look when you have time?

Thank you I really appreciate that.

Bob


From: bokov notifications@github.com Sent: Sunday, June 15, 2014 9:19 PM To: UTHSCSA-CIRD/simSQL Cc: Chien, Jung-Ting Subject: Re: [simSQL] Use zip(), subsetting, indexing, and string manipulation to finish the simTable() command (#3)

Test of comment

Reply to this email directly or view it on GitHubhttps://github.com/UTHSCSA-CIRD/simSQL/issues/3#issuecomment-46128277.