Trying to get started with edx2bigquery and wondering about a few things in the README
Location
Is the implication that edx2bigquery should be installed and run on a compute engine instance, or is it meant to be run locally (e.g. on my Mac OS system)?
Data directories
What is the nature of the files that should exist in the COURSE_SQL_BASE_DIR and COURSE_SQL_DATE_DIR? How are these two paths related to each other? (Looking at the find_course_sql_dir method, it looks like COURSE_SQL_BASE_DIR holds many subdirectories, one for each course, and each of those in turn holds many directories, each containing the files for a certain date?)
Is the convention for course directory names to use double underscores or dashes between the parts of the name?
Does each date directory hold cumulative data, or only the data added since the previous date present in the DATE_DIR?
Are the actual sql files just a SQL dump of the edxapp database from our production MySQL database?
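To make the layout question concrete, here is a minimal sketch of the structure I am assuming: base directory, then one directory per course, then one directory per date. The helper name guess_course_sql_dir, the sep parameter, and the example values are all hypothetical and are not taken from edx2bigquery; in particular, whether the separator is a double underscore or a dash is exactly what I am unsure about.

```python
from pathlib import Path


def guess_course_sql_dir(base_dir, course_id, date_dir, sep="__"):
    """Hypothetical helper, NOT part of edx2bigquery.

    Builds <base_dir>/<course directory>/<date_dir>, where the course
    directory is the course id with "/" replaced by `sep`. The default
    separator is a guess.
    """
    course_dir = course_id.replace("/", sep)
    return Path(base_dir) / course_dir / date_dir


# Example with made-up values for the base directory, course id, and date.
print(guess_course_sql_dir("sql-data", "Org/Course/Run", "2015-01-01"))
print(guess_course_sql_dir("sql-data", "Org/Course/Run", "2015-01-01", sep="-"))
```

If this matches what find_course_sql_dir expects, then COURSE_SQL_DATE_DIR would just be the name of the innermost directory, but I would appreciate confirmation.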
edx api
Looks like some of the code requires the edxapi module, but I don't see any requirement for edx-api-client listed ... is that something I have to pip install manually and then set up via the instructions in that repo?
Thanks for any thoughts!
Also wondering if there are any good tutorials or other documentation by the team on setting up edx2bigquery... a quick search didn't turn up anything but still hoping :-)
@danielmcquillen I am also having a lot of trouble; any help would be appreciated. Did you find out where your sql files were located, for example (i.e. COURSE_SQL_BASE_DIR etc.)?