Currently the wrangle page is very basic. In particular, it assumes that all text documents need to be split up with the same patterns. This is often not the case -- it is not even true for my example datasets!
Instead, we need to present the user with an interactive hierarchy of the files they uploaded. They can select a whole folder and apply splitting patterns to everything within it. They can also select individual files (including several files at the same time) and input the splitting patterns for them.
If a splitting pattern is applied to a whole folder, but a separate splitting pattern is applied to a file within that folder, the one applied to the file supersedes the one applied to the folder.
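A minimal sketch of that override rule, assuming patterns are stored in a plain dict keyed by POSIX-style path (the `assigned` dict, the `effective_pattern` name, and the example paths are all hypothetical, not existing code). It also assumes the nearest ancestor wins when folders are nested, which the rule above does not spell out:

```python
from pathlib import PurePosixPath
from typing import Optional


def effective_pattern(path: str, assigned: dict[str, str]) -> Optional[str]:
    """Return the pattern set on `path` itself, else on its nearest ancestor folder."""
    node = PurePosixPath(path)
    # Check the file first, then walk upwards through its parent folders.
    for candidate in (node, *node.parents):
        pattern = assigned.get(str(candidate))
        if pattern is not None:
            return pattern
    return None  # nothing assigned anywhere above this file


# Hypothetical assignments: one on a folder, one on a file inside it.
assigned = {
    "corpus/news": r"\n-{5,}\n",
    "corpus/news/2019.txt": r"\n={3,}\n",
}

file_level = effective_pattern("corpus/news/2019.txt", assigned)    # file pattern wins
folder_level = effective_pattern("corpus/news/2020.txt", assigned)  # inherited from folder
```

Resolving at lookup time (rather than copying the folder pattern onto each file) means that changing the folder pattern later automatically affects every file that has no override of its own.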
If the user selects a single file, a few things should happen:
The current splitting patterns for that file are shown (e.g., if the user applied a pattern to a folder above it, show it here).
The user should be given an option to view the file. Request its contents from the server using AJAX and display them.
The user should be given an option to "preview" the wrangling for this file only. Send an AJAX request to the server, which wrangles just that file. The results do not need to be stored anywhere; they are just sent back to the user. Display a summary of how many separate articles were found in the file, and display the articles themselves in a readable way.
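The server side of the preview could look roughly like the sketch below. It assumes the splitting pattern is a regular expression and that the response is a JSON-serialisable dict; the function name `preview_wrangle` and the response keys are hypothetical. Nothing is written to disk or a database, the result is only returned:

```python
import re
from typing import Any


def preview_wrangle(text: str, splitter: str, snippet_len: int = 80) -> dict[str, Any]:
    """Split `text` into articles and return a summary without storing anything."""
    # Note: if `splitter` contains capturing groups, re.split also returns the
    # captured text; the `p and` check skips groups that did not participate.
    pieces = re.split(splitter, text)
    articles = [p.strip() for p in pieces if p and p.strip()]
    return {
        "article_count": len(articles),
        "articles": [
            {"index": i, "chars": len(a), "snippet": a[:snippet_len]}
            for i, a in enumerate(articles, start=1)
        ],
    }


sample = "First article.\n-----\nSecond article.\n-----\n"
result = preview_wrangle(sample, r"\n-{5}\n")
```

Returning only a short snippet per article keeps the response small for large files; the full text could be fetched separately if the user expands an article.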
Note that there are several string patterns which must be handled separately. For example, the user might put in an article splitter that applies to a whole folder, but want to specify start-of-article patterns that are different for each file within that folder.
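One way to handle each pattern separately is to resolve every field on its own, so a file can override one pattern while still inheriting the others. This is only a sketch: the `PatternSet` class, the `merge` function, and the example pattern strings are hypothetical, and the two fields are just the two pattern kinds named above (there may be more):

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class PatternSet:
    # None means "not set at this level, inherit from above".
    article_splitter: Optional[str] = None
    start_of_article: Optional[str] = None


def merge(folder: PatternSet, file: PatternSet) -> PatternSet:
    """Overlay the file's patterns on the folder's, field by field."""
    overrides = {
        f.name: getattr(file, f.name)
        for f in fields(file)
        if getattr(file, f.name) is not None
    }
    return replace(folder, **overrides)


# Folder sets the splitter; each file sets its own start-of-article pattern.
folder = PatternSet(article_splitter=r"\n-{5,}\n")
file_a = PatternSet(start_of_article=r"^HEADLINE:")
file_b = PatternSet(start_of_article=r"^Title:")

merged_a = merge(folder, file_a)
merged_b = merge(folder, file_b)
```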