maybe-finance / maybe

The OS for your personal finances
https://maybe.co
GNU Affero General Public License v3.0
28.84k stars 2.19k forks source link

Enable File Upload for CSV transaction imports #799

Closed zachgoll closed 5 days ago

zachgoll commented 2 months ago

Image

Feature Overview

This is a follow up feature to the CSV imports that were already implemented in #708

In the original feature, only "copy/paste" was enabled. This issue's purpose is to add the ability for a user to directly upload a File.

Requirements

If there is a missing / incorrect requirement, please leave a comment before starting work on this.

Implementation Suggestions

Below are some ideas for implementation to get you started. Use your best judgment here—if there's a better way to do things, go for it!

Designs

Below are the designs you should follow while implementing this:

https://www.figma.com/design/lonJmVk3HYkwZoIO7xYP2w/Maybe-App-(Community)?node-id=3656-18086&t=t84SVrxXV2CSiCM7-0

Reminders

joehoyle commented 1 month ago

Related: when importing a CSV it seems there's no option to skip category / tags, i.e. select "no" field to map those to.

zachgoll commented 1 month ago

@joehoyle yeah I had run into this with a few of my own imports too. Created an issue here that we can cover this in:

https://github.com/orgs/maybe-finance/projects/14/views/1?pane=issue&itemId=65794196

joehoyle commented 1 month ago

Just another note: I tried to import CSV of about 6k lines. I was able to paste it into the field and import, but that basically hung the docker container for over an hour, and the import never completed, so it seems maybe the CSV imported has performance issues.

zachgoll commented 1 month ago

@joehoyle any chance you have some logs you could share?

The import happens in a background job, so can you check to make sure you have GOOD_JOB_EXECUTION_MODE: async set in your environment?

joehoyle commented 1 month ago

@zachgoll I took a further look into this, it's actually getting stuck on rendering clean.html.rb specifically trying to render each row as that uses ActionView::Helpers::FormBuilder per item. I tried removing the below from the template, to try unblock it:

<%= form.fields_for :csv_update do |ff| %>
                <%= ff.hidden_field :row_idx, value: row_index %>
                <%= ff.hidden_field :col_idx, value: col_index %>
                <%= ff.text_field :value, value: value,
                                  id: "cell-#{row_index}-#{col_index}",
                                  class: "#{@import.csv.cell_valid?(row_index, col_index) ? "focus:border-transparent border-transparent" : "border-red-500"} border px-4 py-3.5 text-sm w-full bg-transparent focus:ring-gray-900 focus:ring-2 focus-visible:outline-none #{table_corner_class(row_index, col_index, @import.csv.table, row.fields)}",
                                  data: { "auto-submit-form-target" => "auto" } %>
              <% end %>

Even without the above fields, it takes 30 seconds and is 16MB of HTML for the /clean endpoint. I think this might be due to the fact there is a full form per import item, which takes time I guess for building the form, and then there's a significant amount of HTML over the wire per item too.

I wonder if there's more than 200 rows, it should just cut it off by 200 and say "...X more". I can't imagine anyone would want to manually correct more rows than that, and it would be enough to scan that the import looks correct.

Either that, of vastly improve the efficiency of the /clean page output and the markup as I don't think the Browser is goign to be wild about rendering 6k forms either.

joehoyle commented 1 month ago

FYI the /confirm screen renders just fine with this many items, and the import job its self took like 10 seconds, so great stuff there.

MagnusHJensen commented 2 weeks ago

Based on the provided figma file here, I found some differences from the current implemented flow.

Button is not disabled when input is empty

Current

image

Figma:

image

No input validation is done

Current

image

Figma:

image

Help table below has differences (Which one is right? Most likely app)

I'm aware the Figma is most likely wrong here, then maybe a task to get it updated, to remove as many inconsistencies across the board as possible.

However no template file is provided

Current

image

Figma:

image

I propose we break this issue into at least two sub-issues, one for doing the frontend validation through a stimulus controller, a generic one can be made for doing validation checks.

A second sub-issue which handles validating the CSV format on the backend, and renders the page back with the error as stated in figma, so we only do the error validation of the format on the backend. This would include 2 header validations:

On a side note, you can currently do this, which results in the next two steps (clean and confirm) being empty, and the final step (confirm) saying Import 0 transactions

MagnusHJensen commented 2 weeks ago

I may have built this just to get the ropes of rails, ruby and the hotwired stuff 😅

zachgoll commented 2 weeks ago

@MagnusHJensen thanks for going through this! I'd agree that there are a few validation improvements that we could make, and you're welcome to break those off into a separate issue.

In regards to the mismatch between app and designs, some of this was intentional. Anything that would require an additional Stimulus controller has been intentionally omitted for simplicity, and the example data table is correct in the app.

So if we're going to open an issue for validation improvements, let's keep them all server side:

Eventually we'll get some client-side validation via Stimulus controllers added, but for now, we're just trying to keep things as minimal and simple as possible to speed up development and nail down the core flows and requirements.

MagnusHJensen commented 2 weeks ago

@MagnusHJensen thanks for going through this! I'd agree that there are a few validation improvements that we could make, and you're welcome to break those off into a separate issue.

In regards to the mismatch between app and designs, some of this was intentional. Anything that would require an additional Stimulus controller has been intentionally omitted for simplicity, and the example data table is correct in the app.

So if we're going to open an issue for validation improvements, let's keep them all server side:

  • Validate for at least 1 row of data
  • Headers validation

Eventually we'll get some client-side validation via Stimulus controllers added, but for now, we're just trying to keep things as minimal and simple as possible to speed up development and nail down the core flows and requirements.

Sounds great, I'll split that off and make it into an issue, not sure if I'll pick that up specifically since the validation part is a bit hard to navigate, and a bunch of methods exists that should get header validation etc. etc, but it does not seem to work as expected.

That being said, would you be open to a small PR with 3 file changes, that add a VERY simple frontend validation stimulus controller, not expected to be implemented elsewhere right now, but is a starting point in the future?

You can get a snippet here:

import { Controller } from "@hotwired/stimulus";

// Connects to data-controller="validation"
export default class extends Controller {
  static classes = ["invalid"];
  static targets = ["nonEmpty", "submit"];
  static values = { isValid: { type: Boolean, default: false } };

  isValidValueChanged() {
    if (!this.hasSubmitTarget) return;

    if (this.isValidValue) {
      this._makeSubmitValid();
    } else {
      this._makeSubmitInvalid();
    }
  }

  submitTargetConnected() {
    if (this.isValidValue) {
      this._makeSubmitValid();
    } else {
      this._makeSubmitInvalid();
    }
  }

  validate() {
    // Add the validation targets and their respective logic here.
    for (const target of this.nonEmptyTargets) {
      if (!target || !target.value || target.value === "") {
        this.isValidValue = false;
        return;
      }
    }

    this.isValidValue = true;
  }

  _makeSubmitValid() {
    this.element.classList.remove(this.invalidClass);
    this.submitTarget.disabled = false;
  }

  _makeSubmitInvalid() {
    this.element.classList.add(this.invalidClass);
    this.submitTarget.disabled = true;
  }
}

The above controller could be expanded to cover more validation types in the future, and have the same controller used in multiple places.

Also do you think the upload specs are refined enough for a PR to be opened? As stated I started some work while trying to get more familiar with the codebase :D

zachgoll commented 1 week ago

@MagnusHJensen sorry, had not seen this when originally posted. In regards to the validation of headers, I think we can rely on the native HTML required validation in that second step where we map the headers.

By that time, if the user has not provided the correct headers, they will realize that there is no valid column to select and will have to return to the previous step and update their CSV.

While a controller like #977 doesn't add too much complexity at first, my worry is more in the fact that we're introducing extra lines of code without a pressing reason to do so. As it stands right now, if a user tries to submit a blank CSV in the first step, they will see this:

CleanShot 2024-07-12 at 10 12 09@2x

While I'd agree that adding in a Stimulus controller to toggle the style of the submit button does create a marginally better experience, it's one of those tradeoffs that's probably worth making early on in the project to lean towards simplicity.