bedatadriven / activityinfo-R

ActivityInfo R Language Client
https://www.activityinfo.org/support/docs/R/

improve `importTable()` and process of uploading a new table of data into ActivityInfo #44

Closed Ryo-N7 closed 1 year ago

Ryo-N7 commented 1 year ago

currently the process of:

alongside the changes to how we manipulate the form/database schemas in issues #32 and #31, we should think of ways to streamline this process

Ryo-N7 commented 1 year ago

importTable() needs a reader for NARRATIVE long-form text field types

Getting form c5e73092971 schema returned with status 200: success
Error in prepareImport(schema$elements[[fieldIndex]], columnName, data[[columnName]]) :
  Field 'Descripcion' has unsupported type 'NARRATIVE'

Ryo-N7 commented 1 year ago

importTable() should report the Job ID so that the user can look up the server error themselves (given proper permissions etc.)

Getting form c736c52093c schema returned with status 200: success
POST request to https://www.activityinfo.org/resources/jobs returned with status 200: success
GET request to https://www.activityinfo.org/resources/jobs/ahBlfmFjdGl2aXR5aW5mb2V1chALEgNKb2IYgICA7beLpgsM returned with status 200: success
Waiting for importRecords job to complete: 0%
GET request to https://www.activityinfo.org/resources/jobs/ahBlfmFjdGl2aXR5aW5mb2V1chALEgNKb2IYgICA7beLpgsM returned with status 200: success
Waiting for importRecords job to complete: 0%
GET request to https://www.activityinfo.org/resources/jobs/ahBlfmFjdGl2aXR5aW5mb2V1chALEgNKb2IYgICA7beLpgsM returned with status 200: success
Waiting for importRecords job to complete: 0%
Error in executeJob("importRecords", descriptor = list(formId = formId, :
  Job failed. Code: SERVER_ERROR, Message: Server error
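
One way the Job ID could be surfaced is by attaching it to the error condition itself, so callers can read it in a handler. The sketch below is base R only; `jobFailedError()` and `runJobSketch()` are hypothetical names made up for illustration and are not functions of the activityinfo package, and the job ID is a placeholder.

```r
# Hypothetical constructor: an error condition that also carries the job ID.
jobFailedError <- function(jobId, code, message) {
  structure(
    class = c("jobFailedError", "error", "condition"),
    list(
      message = sprintf("Job %s failed. Code: %s, Message: %s", jobId, code, message),
      call = NULL,
      jobId = jobId,
      code = code
    )
  )
}

# Stand-in for a job runner (an assumption, not the package's executeJob()).
runJobSketch <- function(jobId, state, code = NULL, message = NULL) {
  if (identical(state, "failed")) {
    stop(jobFailedError(jobId, code, message))
  }
  invisible(jobId)
}

# The caller catches the classed condition and can look the job up afterwards.
result <- tryCatch(
  runJobSketch("job-123", "failed", code = "SERVER_ERROR", message = "Server error"),
  jobFailedError = function(e) e
)
result$jobId
conditionMessage(result)
```

Because the condition still inherits from "error", code that does not handle `jobFailedError` specifically would see an ordinary error whose message already includes the job ID.
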

Ryo-N7 commented 1 year ago

maybe also a way to pick/choose the correct column label (or specify it with the field code) when importTable() gives you an error because of ambiguous field labels?
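
A rough sketch of what that matching could look like: try the field code first, fall back to a unique label, and list the candidate codes when a label is ambiguous. The `fields` table is mock data and `resolveField()` is a hypothetical helper; the column names `id`, `code` and `label` are assumptions about what a schema table might contain, not the package's documented structure.

```r
# Mock field table with two fields sharing the label "Description".
fields <- data.frame(
  id    = c("f1", "f2", "f3"),
  code  = c("NAME", "DESC_A", "DESC_B"),
  label = c("Name", "Description", "Description"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a data column name to a field id.
resolveField <- function(columnName, fields) {
  byCode <- which(!is.na(fields$code) & fields$code == columnName)
  if (length(byCode) == 1L) return(fields$id[byCode])

  byLabel <- which(fields$label == columnName)
  if (length(byLabel) == 1L) return(fields$id[byLabel])

  if (length(byLabel) > 1L) {
    stop(sprintf(
      "Column '%s' matches %d fields by label; use one of these codes instead: %s",
      columnName, length(byLabel), paste(fields$code[byLabel], collapse = ", ")
    ), call. = FALSE)
  }
  stop(sprintf("Column '%s' matches no field code or label", columnName), call. = FALSE)
}

resolveField("DESC_B", fields)  # matched by code
resolveField("Name", fields)    # matched by unique label
ambiguous <- tryCatch(resolveField("Description", fields), error = conditionMessage)
ambiguous
```
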

Ryo-N7 commented 1 year ago

ideal pain-free process:

  1. get the table/form made in Excel or whatever format can be ingested
  2. find out the field name + field type of each column
  3. using that info, create an ActivityInfo-compliant schema based on the given Excel/whatever table (schema helper functions)
  4. upload that new schema as a new form into the ActivityInfo db (addForm() function) --- what does addForm() return currently??
  5. once that completes, it gives me back the ID of the newly created form
  6. using the form ID I can use add/updateRecord() or even importTable() to fill in the data values
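
Steps 2 and 3 above could start from something like the following base R sketch, which guesses a field type per column of a data frame. `inferFieldType()` and `describeColumns()` are hypothetical helpers, and the type labels ("text", "multiline", "quantity", "date", "singleSelect") and the 100-character threshold are assumptions chosen for illustration; mapping them onto the package's field schema functions is left out.

```r
# Hypothetical helper: guess a field type from an R column.
inferFieldType <- function(x) {
  if (is.numeric(x)) return("quantity")
  if (inherits(x, "Date")) return("date")
  if (is.factor(x)) return("singleSelect")
  # Long or multi-line strings are treated as narrative text (assumed rule).
  isLong <- nchar(x) > 100 | grepl("\n", x, fixed = TRUE)
  if (is.character(x) && any(isLong, na.rm = TRUE)) return("multiline")
  "text"
}

# One row per column: the label to use and the guessed type.
describeColumns <- function(data) {
  data.frame(
    label = names(data),
    type = unname(vapply(data, inferFieldType, character(1))),
    stringsAsFactors = FALSE
  )
}

example <- data.frame(
  Site = c("A", "B"),
  Litres = c(12.5, 30),
  Visited = as.Date(c("2023-01-10", "2023-02-01")),
  Notes = c("Short note", "Line one\nLine two"),
  stringsAsFactors = FALSE
)
plan <- describeColumns(example)
plan
```

The resulting table is only a starting point that a user would review by hand before building the actual form schema.
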
nickdickinson commented 1 year ago

So I think you can now do items 2, 3 and 4 with the form field schema functions.

For example:

fmSchm <- formSchema(databaseId = databaseId, label = paste0("R form with multiple fields test ", cuid()))

fmSchm <- fmSchm |> 
  addFormField(multilineFieldSchema(label = "A narrative")) |>
  addFormField(textFieldSchema(label = "A text field")) |>
  addFormField(quantityFieldSchema(label = "A water quantity field", unit = "litres per day"))

dbMetadata <- addForm(databaseId = databaseId, schema = fmSchm)

fmSchm2 <- getFormSchema(formId = fmSchm$id)

From that point on, you should be able to import your data as in item 6. You can try it if you run from https://github.com/bedatadriven/activityinfo-R/tree/version-4.32

I'll try to make a vignette with a reproducible example before Wednesday.

Ryo-N7 commented 1 year ago
databaseId <- "c5xypudl7oni7c52"

fmSchm <- formSchema(databaseId = databaseId, label = paste0("newform_yay", cuid()))

fmSchm <- fmSchm %>% 
  addFormField(multilineFieldSchema(label = "A narrative")) %>% 
  addFormField(textFieldSchema(label = "A text field")) %>% 
  addFormField(quantityFieldSchema(label = "A water quantity field", unit = "litres per day")) %>% 
  addFormField(userFieldSchema(label = "THIS_DUDE_RIGHT_HERE", databaseId = databaseId)) %>% 
  addFormField(singleSelectFieldSchema(label = "select from one of these options", 
                                       values = list("jeff", "goldblum", "tevez", "horatio", "jimothy")))

in general the error messages seem way too vague, but I'm not sure how to improve them from the API's p.o.v. since everything underneath is really a JSON-related message

Ryo-N7 commented 1 year ago

documentation for ALL of the field schema functions in one place makes sense.

the formatting could be improved though: with everything in one giant 'Description' section separated only by new lines, it's a bit hard to tell where the paragraph for a particular field begins and ends. I think just having them as their own bullet points would help distinguish each a bit more

Ryo-N7 commented 1 year ago

this is a more general point re: documentation: it should try to explain ActivityInfo API concepts further. For example, in the field schema docs, the Arguments section says presentation: Default is "automatic", but it doesn't give me any info on what presentation is. I'm sure this is all stuff that's just copy-pasted from the API docs, but a bit more elaboration would help users.

Ryo-N7 commented 1 year ago

importTable() bug? whenever I run the function without parentIdColumn I get the message >> argument 'parentIdColumn' is missing, but the function still works without it

it still works even though no default is stated, so do the function parameters in the package need fixing?

function (formId, data, recordIdColumn, parentIdColumn) 
{
  parentId <- NULL
  schema <- activityinfo::getFormSchema(formId)
  schemaTable <- as.data.frame(schema)
  subform <- !is.null(schema$parentFormId)
  providedCols <- names(data)
  if (!missing(recordIdColumn)) {
    recordId <- recordIdFromData(data, recordIdColumn)
    providedCols <- providedCols[providedCols != recordIdColumn]
  }
  else {
    recordId <- rep.int(NA_character_, times = nrow(data))
  }
  if (subform) {
    parentId <- parentIdFromData(data, parentIdColumn, schema)
    providedCols <- providedCols[providedCols != parentIdColumn]
  }
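
One possible explanation, sketched with toy functions rather than the package code: R evaluates arguments lazily, so an argument with no default only raises an error on a code path that actually uses it. `withoutDefault()` and `withDefault()` below are hypothetical stand-ins; giving the parameter an explicit `NULL` default is one option (an assumption, not necessarily the fix adopted in the package) that makes it visibly optional and allows a clearer error for subforms.

```r
# No default: the missing argument is only a problem if the body evaluates it.
withoutDefault <- function(data, parentIdColumn, subform = FALSE) {
  if (subform) {
    return(data[[parentIdColumn]])  # evaluated only on this branch
  }
  "imported without parent IDs"
}

# Explicit NULL default: the signature documents that the argument is optional.
withDefault <- function(data, parentIdColumn = NULL, subform = FALSE) {
  if (subform) {
    if (is.null(parentIdColumn)) {
      stop("parentIdColumn is required when importing into a subform", call. = FALSE)
    }
    return(data[[parentIdColumn]])
  }
  "imported without parent IDs"
}

d <- data.frame(parent = c("p1", "p2"), stringsAsFactors = FALSE)

okLazy <- withoutDefault(d)  # works: parentIdColumn is never touched
lazyError <- tryCatch(withoutDefault(d, subform = TRUE), error = conditionMessage)
clearError <- tryCatch(withDefault(d, subform = TRUE), error = conditionMessage)
```
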
nickdickinson commented 1 year ago

Some quick responses:

I agree it is a bit strange to add the databaseId but I think it is also the simplest solution at the moment.

akbertram commented 1 year ago

Note to self: also needs to work if a recordId is not provided.

nickdickinson commented 1 year ago

recordId bug: https://github.com/bedatadriven/activityinfo-R/issues/44#issuecomment-1422207640

nickdickinson commented 1 year ago

While implementing migrateField() see if bug still exists (provide recordIds).

Ryo-N7 commented 1 year ago

I'm getting a bug with importTable() where if i try to import data into a field that is single-select with values that are numeric/integer then the function bugs out and causes my session to crash

ex.

i have a "Fiscal Year" column that's set to numeric/integer in Excel/Access/data.frame/etc. but the form schema in activityinfo has it set as a single-select column with the options "2018", "2019", etc...

so when i try to push data from R where the "Fiscal Year" column holds 2018, 2019, etc. as numeric/integer values, my session crashes. The way to solve this problem is to simply convert "Fiscal Year" with as.character() and then run importTable()

but yeah it's a weird issue i've found, i'll post screenshots and a bit more detail tomorrow if i can replicate it elsewhere
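
The as.character() workaround described above can be wrapped in a small base R helper that converts the chosen columns before the data is handed to importTable(). `coerceToCharacter()` is a hypothetical name, and which columns correspond to single-select fields is something the caller is assumed to know from the form schema.

```r
# Hypothetical helper: convert the named columns to character, leave the rest alone.
coerceToCharacter <- function(data, columns) {
  missingCols <- setdiff(columns, names(data))
  if (length(missingCols) > 0L) {
    stop("Columns not found in data: ", paste(missingCols, collapse = ", "), call. = FALSE)
  }
  data[columns] <- lapply(data[columns], as.character)
  data
}

records <- data.frame(
  `Fiscal Year` = c(2018L, 2019L),
  Amount = c(10.5, 20),
  check.names = FALSE
)

prepared <- coerceToCharacter(records, "Fiscal Year")
str(prepared)
```
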