GenieFramework / Stipple.jl

The reactive UI library for interactive data applications with pure Julia.
MIT License

Uploading csv to backend to process and display results #89

Open mdsa3d opened 2 years ago

mdsa3d commented 2 years ago

I would like to know how I can upload a CSV file through Stipple to the backend for processing. Currently, I am working with the example provided by @essenciary, which displays a random DataFrame in the frontend. However, I want client users to upload their own CSV files.

Example Script:

using Genie, Stipple
using StippleUI
using DataFrames

dt = DataFrame(rand(10,2), ["x1", "x2"])
dt_opts = DataTableOptions(columns = [Column("x1"), Column("x2", align = :right)])
# WEB_TRANSPORT = Genie.WebThreads #for AJAX/HTTP

@reactive mutable struct APP <: ReactiveModel
  data::R{DataTable} = DataTable()
  data_pagination::DataTablePagination = DataTablePagination(rows_per_page=50)       
end

function handlers(model)

  on(model.isready) do isready
    isready || return 

    model.data[] = DataTable(dt, dt_opts)
  end

  model
end

function ui(model::APP)
  page(model, 
  [
      heading("Dashboard") 

      row([
          cell(class="st-module", [
              h4("Dataset")
              # add your table logic here: 
              table(:data;
              pagination=:data_pagination, 
              style="height: 350px;"
              #dense=true, flat=true, 
              )
          ])
      ])
  ])
end

route("/") do
  APP |> init |> handlers |> ui |> html
end
up(9000)

Thanks, I would highly appreciate the help! Looking forward to your suggestions.

AbhimanyuAryan commented 2 years ago

For file uploads you can easily extend Stipple components with Genie (our full-stack framework): https://www.genieframework.com/docs/tutorials/Handling-File-Uploads.html

Please feel free to share your feedback about our docs. I'm working on the Genie framework documentation. Let me know if anything in the docs is confusing, non-obvious, or needs any kind of improvement.

You can check a Stipple/Genie file upload example here: https://github.com/AbhimanyuAryan/stipple-fastai/blob/main/DetectBears/routes.jl

I will move this project to Stipple Demos soon (it needs more work, and I haven't had time recently to finish it).

AbhimanyuAryan commented 2 years ago

Also, if you don't need Genie-specific functionality, you can skip using Genie at the top entirely. Most of the rendering logic (the HTML methods) is exported to Stipple from Genie.

In the example Adrian provided, you can remove using Genie from the top. Just writing using Stipple, StippleUI makes it cleaner (less code) :)

hhaensel commented 2 years ago

In this discussion you can find an example of uploading a file.

mdsa3d commented 2 years ago

Thanks @AbhimanyuAryan and @hhaensel for the suggestions and help; the recommendations have been quite helpful. Based on the shared posts I have prepared a script which can upload CSV files to the backend. Code:

using Genie, Stipple
using Genie.Requests
using StippleUI
using DataFrames

Genie.config.cors_headers["Access-Control-Allow-Origin"]  =  "*"
Genie.config.cors_headers["Access-Control-Allow-Headers"] = "Content-Type"
Genie.config.cors_headers["Access-Control-Allow-Methods"] = "GET,POST,PUT,DELETE,OPTIONS"
Genie.config.cors_allowed_origins = ["*"]

function get_storage_dir(name)
  try
    if Sys.iswindows()
      mkdir("$(homedir())\\Desktop\\$name")
    elseif Sys.islinux()
      mkdir("$(homedir())/$name")
    end 
  catch
    @warn "directory already exists"
    if Sys.iswindows()
      dir_path = "$(homedir())\\Desktop\\$name"
    elseif Sys.islinux()
      dir_path = "$(homedir())/$name"
    end 
    return dir_path
  end
end

const FILE_PATH = get_storage_dir("UploadFolder")

dt = DataFrame(rand(10,2), ["x1", "x2"])
dt_opts = DataTableOptions(columns = [Column("x1"), Column("x2", align = :right)])

# WEB_TRANSPORT = Genie.WebThreads #for AJAX/HTTP

@reactive mutable struct APP <: ReactiveModel
  data::R{DataTable} = DataTable()
  data_pagination::DataTablePagination = DataTablePagination(rows_per_page=50)       
end

function handlers(model)

  on(model.isready) do isready
    isready || return 

    model.data[] = DataTable(dt, dt_opts)
  end

  model
end

function ui(model::APP)
  page(model, 
  [
      heading("Dashboard") 

      row([
        Html.div(class="col-md-12", [
          uploader(label="Upload Dataset", :auto__upload, :multiple, method="POST",
          url="http://localhost:9000/", field__name="csv_file")
        ])
      ])

      row([
          cell(class="st-module", [
              h4("Dataset")
              # add your table logic here: 
              table(:data;
              pagination=:data_pagination, 
              style="height: 350px;"
              #dense=true, flat=true, 
              )
          ])
      ])
  ])
end

route("/") do
  APP |> init |> handlers |> ui |> html
end

#uploading csv files to the backend server
route("/", method = POST) do
  files = Genie.Requests.filespayload()
  for f in files
      write(joinpath(FILE_PATH, f[2].name), f[2].data)
      @info "Uploading: " * f[2].name
  end
  if length(files) == 0
      @info "No file uploaded"
  end
  return "upload done"
end

up(9000)

Hopefully, this is helpful for others. Please do suggest any modifications to make this code better and faster. Thanks again for the help!

mdsa3d commented 2 years ago

I would like to ask a follow-up question. While processing the CSV file at the backend, I will convert it into a DataFrame object, and from this object data and data_opts are generated to render the table in the frontend. My question is: how can I automate data_opts for each CSV file? With each CSV file, the number of columns and their names will change. I would highly appreciate any suggestions!
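
One possible way to derive the table options from whatever columns a CSV contains is sketched below. It assumes CSV.jl is installed; dataframe_from_upload and table_options are hypothetical helper names, and right-aligning numeric columns is only an illustrative choice. Column, DataTableOptions and DataTable are used the same way as in the script above.

```julia
using CSV, DataFrames
using Stipple, StippleUI

# Hypothetical helper: parse uploaded bytes (e.g. f[2].data in the POST route)
# into a DataFrame. Assumes the upload is a well-formed CSV with a header row.
dataframe_from_upload(bytes::AbstractVector{UInt8}) = CSV.read(IOBuffer(bytes), DataFrame)

# Hypothetical helper: build one Column per DataFrame column, whatever the
# column names are. Numeric columns are right-aligned, everything else left.
function table_options(df::DataFrame)
    cols = map(names(df)) do name
        numeric = eltype(df[!, name]) <: Union{Missing, Number}
        Column(name, align = numeric ? :right : :left)
    end
    return DataTableOptions(columns = cols)
end

# Usage with an in-memory CSV standing in for an uploaded file
df = dataframe_from_upload(Vector{UInt8}("city,population\nOslo,700000\n"))
table = DataTable(df, table_options(df))
```

If no per-column customization is needed, constructing the table as DataTable(df) without explicit options may already be enough, but that is worth checking against the StippleUI version in use.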

essenciary commented 2 years ago

What flow do you have in mind? "Empty" page, then users upload a CSV and then display it? That's relatively easy.

1/ The globals dt and dt_opts are quite bad for performance (and as a general approach) and you don't really need them, so you can remove them along with the on(isready) code, because there you're only setting some random data, which is useless.

2/ inside the handler of route("/", method = POST) you need to put the update code, ex:

# code to process the uploaded CSV
model.data[] = DataTable(DataFrame_from_CSV, DataTableOptions(whatever_options_make_sense_for_the_data))

However - in the route you don't have access to that model instance that corresponds to the user making the request so you can't update the user's frontend. Which means that you need to have a reference to the user's channel (the connection id), pull the corresponding model and update it to send data to that specific user. The best way to do it is to drop a session cookie on the first request (when the page is loaded) and store the channel onto the cookie. Then, with the POST request you have access to the cookie so you can retrieve the channel from the cookie, pull the model and update it.
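
The lookup step described here (request identifier, then model, then update) can be sketched without any Genie or Stipple specifics. Everything below is a stand-in: DemoModel replaces the real ReactiveModel, the session id string replaces whatever the cookie would carry, and the comma splitting is a deliberately naive substitute for a real CSV parser.

```julia
# Stand-in for a Stipple model; in the thread this would be the APP model.
mutable struct DemoModel
    rows::Vector{Vector{String}}
end

# Registry of live models keyed by a per-user identifier (assumed to come
# from the session cookie). The lock guards against concurrent requests.
const MODELS = Dict{String, DemoModel}()
const MODELS_LOCK = ReentrantLock()

# Called on the first page load, once the user's model has been created.
register_model!(session_id, model) = lock(MODELS_LOCK) do
    MODELS[session_id] = model
end

# Returns the user's model, or nothing if the id is unknown.
find_model(session_id) = lock(MODELS_LOCK) do
    get(MODELS, session_id, nothing)
end

# Called from the POST handler: find the uploader's model and update it.
# Returns false when no model is registered for this session id.
function handle_upload!(session_id, csv_text)
    model = find_model(session_id)
    model === nothing && return false
    model.rows = [String.(split(line, ',')) for line in split(strip(csv_text), '\n')]
    return true
end
```

In the real app the last step would assign to model.data[] instead, which is what pushes the new table to that user's browser.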

I know it sounds intimidating and it's a pretty complex workflow, but it's just a few lines of code really. I have an unpublished demo, I'll look for it.

As a similar approach, you can use a session_id as part of the URL, like in this example: https://github.com/GenieFramework/StippleDemos/blob/bc6f18451e1493216e0538efd7ceb2493b6eebb7/AdvanceExamples/MultiUserApp.jl (no need for cookies).


If however you want to allow the user to upload any number of CSV files on the same page, that's gonna be more complicated as you have to dynamically output extra tables and that's trickier.


Other code comments: the get_storage_dir function is overkill. If you need to write OS specific code then surely something is forced there. There's Julia provided homedir which gives you the user's home dir in a cross platform manner - but in general that's a bad idea. Just store the files with the app. Make an uploads dir in the root of the app and just say "uploads/".
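
A minimal sketch of that simpler approach, using only Base functions. save_upload is a hypothetical helper name, and stripping the directory part of the client-supplied file name with basename is an extra precaution that is not part of the original script.

```julia
# Hypothetical replacement for get_storage_dir: keep uploads next to the app.
# mkpath creates the directory if needed and does nothing if it already exists,
# so no OS-specific branches or try/catch are required.
function save_upload(filename::AbstractString, data; dir::AbstractString = "uploads")
    mkpath(dir)
    # basename drops any directory components sent by the client
    path = joinpath(dir, basename(filename))
    write(path, data)
    return path
end
```

Inside the POST route, the write call could then become save_upload(f[2].name, f[2].data).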

essenciary commented 2 years ago

Here is an example that uses sessions to persist user data:

using Stipple, StippleUI
using Random, Genie.Sessions

Sessions.init()
@reactive! mutable struct Name <: ReactiveModel
  name::R{String} = ""
end

function ui(model)
  on(model.isready) do _
    model.name[] = Sessions.get!(:name, "")
  end

  on(model.name) do val
    Sessions.set!(:name, val) |> Sessions.persist
  end

  [
    page(model, title="Hello Stipple", [
      h1([
        "Hello, "
        span([], @text(:name))
      ])

      p([
        "What is your name? "
        input("", placeholder="Type your name", @bind(:name))
      ])
    ], @iif(:isready))
  ]
end

route("/") do
  init(Name) |> ui |> html
end

essenciary commented 2 years ago

@hhaensel I think we need to implement some form of ModelStorage layer to make it easy to persist and retrieve models - I'll work on it.

hhaensel commented 2 years ago

I agree. I'm still wondering what a good approach could be. I have in mind something like

const ModelDict = Dict{String, ReactiveModel}
const ModelStorage = Dict{Symbol, ModelDict}

# Extend Base.push! so that a "key => model" pair is filed under the model's type name
function Base.push!(d::ModelStorage, modelpair::Pair{String, T}) where T <: ReactiveModel
    haskey(d, Symbol(T)) || push!(d, Symbol(T) => ModelDict())
    push!(d[Symbol(T)], modelpair)
end

const MODELSTORE = ModelStorage()

push!(MODELSTORE, "user1" => model)

We might consider using the more performant Dictionaries.jl instead of Dicts.

essenciary commented 2 years ago

I'm thinking of the following features:

1/ session management by default - we need a way to associate the current request with a model instance. Most likely by storing the channel onto a session cookie.

2/ model storage - ability to keep the model itself in RAM, indexed by channel. Here we should also provide the option to serialize the model instead of keeping it live in RAM, and to deserialize it automatically. If serialization and deserialization work well out of the box, that should most likely be the default (to avoid memory leaks).

3/ model data storage - when storing the actual model is not necessary (and most times I expect it won't be, leading to improved RAM usage), we can associate the fields and the data with the channel and store this data JSON serialized on the session.

As part of this I also want to look at:

a) memory leaks - are the closed socket connections keeping a reference to the models, blocking garbage collection?

b) automatic purging of closed WebSockets clients

mdsa3d commented 2 years ago

@hhaensel If I may suggest, Serialization.jl combined with struct objects could be a great option to keep the models small and performant. In addition, this could also allow users (app owners) to store the sessions (haha, could be bad practice, but just putting it out there).
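
As a rough illustration of this suggestion, the sketch below round-trips a plain struct through Julia's Serialization standard library. ModelSnapshot, save_snapshot and load_snapshot are hypothetical names, and storing one file per channel is only an assumption about how such a store might be laid out.

```julia
using Serialization

# Hypothetical plain-data snapshot of a model's fields, keyed by channel.
struct ModelSnapshot
    channel::String
    fields::Dict{Symbol, Any}
end

# Write the snapshot to <dir>/<channel>.jls and return the file path.
function save_snapshot(dir::AbstractString, snap::ModelSnapshot)
    mkpath(dir)
    path = joinpath(dir, snap.channel * ".jls")
    serialize(path, snap)
    return path
end

# Read a snapshot back; throws if no file exists for the channel.
load_snapshot(dir::AbstractString, channel::AbstractString) =
    deserialize(joinpath(dir, channel * ".jls"))
```

Two caveats apply to this format: files written by one Julia version are not guaranteed to be readable by another, and deserializing files from untrusted sources is unsafe.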

hhaensel commented 2 years ago

There is also BSON, which stores binary Julia data.

hhaensel commented 2 years ago

Just some more thoughts to allow for more than one tab. I think we need to

If we don't separate model and channel, we end up with two tabs displaying the same view.

@essenciary would you agree?

montyvesselinov commented 2 years ago

It would be great if there were an example demonstrating how you can upload and process a CSV file.

For example, it would be nice to show how you can upload a custom CSV file to perform k-means clustering, using the example https://github.com/GenieFramework/StippleDemos/tree/master/IrisClustering but with an uploaded CSV file. The CSV file should be uploaded using, for example, https://github.com/GenieFramework/StippleDemos/tree/master/BasicExamples/CsvUpload

essenciary commented 1 year ago

@PGimenez please review and close as you see fit