Absolutely. Being a new feature, let's make sure all the Core/existing features are built first – but then I think this would be an amazing addition to Directus. I've spoken with many clients that cite importing as a big need... and exporting really goes a long way towards our view of data portability.
Let's also include support for exporting / importing to JSON. That would also make it pretty easy to maintain the default schema for installation. I was thinking about the following structure:
{
"<table>": [
{ "<field>": "<value>" },
{ "<field>": "<value>" }
]
}
For example, the Directus installation schema / start-data could look like:
{
"directus_fields": [
{
"id": 1,
"collection": "directus_users",
"field": "avatar",
"type": "FILE",
"interface": "single-file",
"options": null,
"locked": 1,
"etc": "..."
}
]
}
NOTE: This is only for values, not the schema itself
Interesting idea. SQL should come first since it's "native" to our storage... but JSON is a nice to have format too! And easier to read and create manually.
Preferably it supports SQL (dump), JSON, and CSV formats.
Editing large datasets in Excel / Numbers and then importing them into Directus would be extremely powerful.
I'd love something like this
Importing and exporting database schemas will make expanding Directus fast. Think of a database schema for accounting, ERP, or maybe school management, etc. This will also enhance your marketplace.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
To achieve better clarity/visibility, we are now tracking feature requests within the Feature Request project board.
This issue being closed does not mean it's not being considered.
I have existing custom CSV import and export endpoints, as most of our clients using Directus have spreadsheets of hundreds of rows of data and also want to export data as CSV to give to accountants, managers, and such. Could be in the pipeline soon if I can merge the custom code with the core.
https://cdn.philleepflorence.com/uploads/00000000137.jpg is what it looks like in v6.x
If we can define the MVP features/options I can draft a design for this. Then we can look into mapping what you come up with into Core (or as an included extension). Thanks @philleepflorence!
Okay, will do!
We will try to support the types mentioned below.
Different "types" of import/export?:
- Schema only: Just import/export your table setup & corresponding records in directus_* tables
- Data only: Just import/export the data
- Everything: A full database dump. Useful for backups & migrating between servers / hosted-local
I am planning to use db-dumper for the data export. This package supports MySQL and PostgreSQL.
I didn't find any well-maintained package for DB import. At the current stage, I am planning to use the command below for MySQL:
mysql -u <username> -p<PlainPassword> <databasename> < <filename.sql>
For PostgreSQL
psql -h localhost -U user -W -d database_name -f path/to/file.sql
P.S. Still looking for a better solution.
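For reference, here is a rough sketch of how those CLI imports could be wrapped without a dedicated import package, assuming a Node/TypeScript context. The binary names and flags come straight from the commands above; everything else (function name, connection handling) is illustrative only:

```ts
import { spawn } from 'child_process';
import { createReadStream } from 'fs';

function importSqlDump(
  vendor: 'mysql' | 'postgres',
  file: string,
  db: string,
  user: string,
  password: string
): Promise<void> {
  // mysql reads the dump from stdin (the `< filename.sql` part of the command above);
  // psql could also take `-f file.sql`, but streaming stdin keeps both paths identical.
  const args =
    vendor === 'mysql'
      ? ['-u', user, `-p${password}`, db]
      : ['-h', 'localhost', '-U', user, '-d', db];

  const child = spawn(vendor === 'mysql' ? 'mysql' : 'psql', args, {
    // psql reads the password from PGPASSWORD instead of a CLI flag
    env: vendor === 'postgres' ? { ...process.env, PGPASSWORD: password } : process.env,
    stdio: ['pipe', 'inherit', 'inherit'],
  });

  createReadStream(file).pipe(child.stdin!);

  return new Promise((resolve, reject) => {
    child.on('error', reject);
    child.on('exit', (code) =>
      code === 0 ? resolve() : reject(new Error(`import exited with code ${code}`))
    );
  });
}
```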
Just adding one more thought here:
A big ask from Enterprise customers has been migrating between environments. So I think it would be nice if our export option had the ability to dump different things (directus_users, etc). I guess as long as we can define different "outputs" for export to create, then we should be fine.
Another thought: we can't rely on SQL imports for Cloud. One could inject custom SQL in there and screw up everything. (Which also means we need to have a way to globally disable imports from the config file.)
Are you recommending we use an alternate format, such as JSON? That will likely be far more difficult to build and will make importing relatively proprietary. It's nice keeping with pure SQL whenever possible... though I recognize that it has security implications if not done correctly.
For Cloud we could make Import an Enterprise-only feature. Would that help since they are isolated?
I think it's most flexible / secure to have a dedicated structure for Directus. It allows us to choose what to export / import, and it allows us to do some validation on the schema before injecting it. A full database dump can just as easily be done directly from the database. The real value that Directus can add here is based on the idea that Directus is able to read and manipulate the to-be-imported data.
One other point: we might want to consider making this a "module" extension so it's outside of the core logic. There might be many different ways to accomplish this feature... and an extension gives us flexibility and avoids potential breaking changes in the future.
Some notes: for importing into a shared environment, the import can be allowed only if the user is already in the DB and is an admin. I can share some logic on this. The only thing we do not allow is fresh import by an anonymous user. If the user is an admin and authenticated there are ways to only import into the same DB and/or "project".
Making it an extension is definitely more efficient and safer than adding to the core.
@philleepflorence As you suggested, the /import and /export endpoints are admin-only, similar to settings.
I wanted to recommend an approach based on experiences with some other systems.
The primary method of import/export is done with CSV files. CSV works very well because it is well supported by many applications/services, and the data can be utilized by anyone with a spreadsheet editor.
Exporting: Any list view or search result can be exported to CSV by clicking an icon. This is nice because it allows for exporting all of the data or only the specific data that is needed. So for Directus, the filtered view would export the data returned from the filtered search and include only the columns that are selected. It should also include the database record id.
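A minimal sketch of that export step, written in TypeScript purely as an illustration (the /items endpoint shape, base URL, token handling, and CSV escaping here are assumptions, not the actual Directus implementation):

```ts
// Hypothetical helper: re-run the list view's query and turn the result into CSV text.
async function exportFilteredView(
  baseUrl: string,
  collection: string,
  params: URLSearchParams,
  token: string
): Promise<string> {
  const res = await fetch(`${baseUrl}/items/${collection}?${params}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const { data } = (await res.json()) as { data: Record<string, unknown>[] };
  if (data.length === 0) return '';

  // Always include the record id so the file can be re-imported as updates later.
  const columns = ['id', ...Object.keys(data[0]).filter((key) => key !== 'id')];
  const escape = (value: unknown) => `"${String(value ?? '').replace(/"/g, '""')}"`;

  return [
    columns.join(','),
    ...data.map((row) => columns.map((col) => escape(row[col])).join(',')),
  ].join('\n');
}
```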
Importing: In the first step of the import process, you select the database table and upload the CSV file. When importing you have the option to process the CSV data as Create, Update, or Both (for "Both", if the id column on the CSV row is not blank it is an update, otherwise it is an insert). All inserts or edits from the CSV import are processed the same way as if they were done through other methods (UI or API), meaning they trigger validation and/or other events set up on the table. So basically with Directus, this would mean processing each row of the CSV as if it were a request from the API and calling any actions and filters for the collection (this could be a user choice whether they want these applied to the import or not). This would also run the import using the permissions of the current user, so importing could be allowed for any user.
For error handling, the entire CSV is processed, and if any rows fail, a result CSV is generated which includes only the failed rows plus an added reason column. This allows for easily fixing the bad data and re-attempting only those records.
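A rough sketch of that row-by-row flow, assuming a generic items client so every row still goes through the regular permission/validation path. The client interface and endpoint paths are illustrative, not the real API:

```ts
// Illustrative client shape; in practice this would be whatever wraps the item endpoints.
interface ItemsClient {
  post(path: string, body: unknown): Promise<unknown>;
  patch(path: string, body: unknown): Promise<unknown>;
}

type CsvRow = Record<string, string>;

async function importCsvRows(
  api: ItemsClient,
  collection: string,
  rows: CsvRow[],
  mode: 'create' | 'update' | 'both'
): Promise<Array<CsvRow & { reason: string }>> {
  const failed: Array<CsvRow & { reason: string }> = [];

  for (const row of rows) {
    try {
      // "both": a non-blank id means update, otherwise insert.
      const isUpdate = mode === 'update' || (mode === 'both' && (row.id ?? '').trim() !== '');
      if (isUpdate) {
        await api.patch(`/items/${collection}/${row.id}`, row);
      } else {
        await api.post(`/items/${collection}`, row);
      }
    } catch (err) {
      // Collect failures so a "failed rows + reason" CSV can be generated for re-import.
      failed.push({ ...row, reason: err instanceof Error ? err.message : 'unknown error' });
    }
  }

  return failed;
}
```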
Thank you @mattgaspar — this is a very clear and concise description of an ideal import/export flow... I love it! I've added this as a priority feature since I think it's one of the more useful things for all users to have.
One thing we would need to consider is how relational data is imported/exported. I've drafted a rough design using the upcoming Page Info Sidebar Components, but this could also be an Action Button in the page header that opens a modal.
Looks good!
Some other functionality that would be useful to make the import user-friendly:
After clicking import, read the first row of the CSV, which defines the column names (this could be server side, or possibly done on the client, though that could be problematic for very large files; some simple validations of the CSV file could also be done here). Allow the user to select which columns they want to import and allow mapping the CSV column name to the field in Directus. It could provide default mappings if the column names match fields in the collection.
I find this type of functionality useful for doing mass updates. I can export the data set, update it in Excel, and then import only a specific column without needing to remove the extra columns from the CSV.
For "many-to-one" fields you'll have to decide if you want to allow the user to use text, which would require a query to identify the id to use. My preference is that the mapping interface allows you to choose. But there might need to be a way to include the ids for these fields when exporting.
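One possible shape for such a mapping, sketched in TypeScript purely as an illustration (the field names and the resolveByText flag are assumptions, not an existing Directus structure):

```ts
// Per-column mapping chosen by the user before the import runs.
interface ColumnMapping {
  csvColumn: string;        // header name from the uploaded CSV
  field: string | null;     // target Directus field, or null to skip this column
  resolveByText?: boolean;  // many-to-one: look up the related id from a text value
}

const exampleMapping: ColumnMapping[] = [
  { csvColumn: 'id', field: 'id' },
  { csvColumn: 'Title', field: 'title' },
  { csvColumn: 'Category', field: 'category', resolveByText: true },
  { csvColumn: 'Internal Notes', field: null }, // present in the file, not imported
];
```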
Just thought of another issue that will probably be the trickiest part of dealing with CSV files: every value in the CSV file is basically a string, so based on the field type, each value will need to be converted to the correct data type. For arrays, you can treat the cell value as its own comma-separated list!
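Something like the following could handle that conversion; the list of field types and the casting rules are assumptions for illustration, not the actual Directus type system:

```ts
type FieldType = 'string' | 'integer' | 'float' | 'boolean' | 'datetime' | 'array';

// Cast a raw CSV cell (always a string) to the type declared on the Directus field.
function castCell(raw: string, type: FieldType): unknown {
  if (raw === '') return null;
  switch (type) {
    case 'integer':
      return parseInt(raw, 10);
    case 'float':
      return parseFloat(raw);
    case 'boolean':
      return ['1', 'true', 'yes'].includes(raw.toLowerCase());
    case 'datetime':
      return new Date(raw).toISOString();
    case 'array':
      // treat the cell value as its own comma-separated list
      return raw.split(',').map((value) => value.trim());
    default:
      return raw;
  }
}
```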
@mattgaspar can you send me a sample CSV of what you mean? I can test it against the CSV import extension/function I installed on v6.x.
I think it accommodated for escaped quotes but it would help to test so the conversion method can be updated.
I think the only issue I had was the Alias columns; they exported INT values, and it was tricky getting the relationship data back into the DB when importing.
One thing we would need to consider is how relational data is imported/exported. I've drafted a rough design using the upcoming Page Info Sidebar Components, but this could also be an Action Button in the page header that opens a modal.
The way I did it in the last version was to export the raw alphanumeric values, as the user would need to export the junction collection(s) for it to make sense in a spreadsheet and also to facilitate re-importing.
Also, are we going the route of making this an admin endpoint, an extension, or a core feature accessible from any collection? I think the admin endpoints would make it more of a module - vue component - so it is easy to debug or update.
I can port the v6 CSV extension over if @benhaynes is down to collaborate on wire-framing and storyboarding.
@philleepflorence does your CSV import functionality operate against the database directly or does it process the update through Directus core so that permissions for each row are verified and any hooks are called?
@mattgaspar it works through the Directus Zend core via the authenticated user's credentials with a depth of 0; It simply converts the JSON to CSV after retrieving the response.
The same goes for imports.
That way it does not matter the actual DB type!
@philleepflorence sorry if I'm off base, I'm new to using Directus and I don't know much about Zend, but I thought it was mostly used to generate the SQL queries. Are the role-based permissions implemented in the Zend component? I didn't think they were implemented at the database level. I'm referring to the status-based workflow permissions.
@mattgaspar oh okay, yeah it uses the Directus core DB function to fetch or update items, i.e. just like using the API.
Hey @mattgaspar — we would definitely want all imports/exports to go through the full API so that permissions (etc) are honored.
Just thought of another issue that will probably be the trickiest part of dealing with CSV files: every value in the CSV file is basically a string, so based on the field type, each value will need to be converted to the correct data type.
This shouldn't be an issue, since we'll confirm column/field mapping... and each field has a set datatype in Directus. That means that we'll know how to cast each value (eg: string, int).
@philleepflorence — I see this as a core set of API endpoints that can then be used directly on the Core App listing pages... and could also have a dedicated module (extension) for power-user or global exports/imports. Does that make sense? @rijkvanzanten @bjgajjar do you agree?
@benhaynes it makes sense... although it might complicate things as opposed to a strict module for admins or those with permissions to that endpoint. If it is part of the core, would anyone with read permission be able to export? I know import would require update or create permission. But I would allow @rijkvanzanten and @bjgajjar to comment first since they would know the intricacies of the other endpoints.
If an admin endpoint, that is something I can easily tackle over a weekend since I already have most of the code for the CSV extension.
True, an admin page would be easier to build... but I think there are a lot of users who would take advantage of an import/export feature. And there's no reason to limit it to Admins if we can have everything safely run through permissions. Also, we would want to avoid direct SQL queries since those don't trigger our accountability logic (change log).
Perhaps we can apply a similar approach to the endpoints so this is a solid extensible feature.
@bjgajjar do you agree?
LGTM. Let @rijkvanzanten provide his thoughts :)
Having it as an option in the item browse page's sidebar is more useful
I agree — if the users have permission to access (READ) the data, there's no reason not to let them export it in alternate formats. Similarly, if they have CREATE/UPDATE permissions, they should be able to import data. This is a very useful feature for average users who might have Spreadsheets (or other external apps that save in CSV) as part of their workflow.
In fact, this would be a nice and easy way to migrate data from other systems (because most of them already have similar functionalities)
Any progress on the view side of this feature? A lot of us Directus users are waiting for an Export to CSV solution.
No new updates yet. I know there's a lot of people who want this, but there's also a lot of people who want bugs fixed, or want other features, ~or want me to get them coffee~ want better integrations etc etc. We're doing what we can with the resources we have, but I have to prioritize bugs, general stability, and adding tests before anything else.
@yagobski if you wanna help out, it would be great if you could write out a spec (rfc) for how this would ideally work for your use case 😊 (as detailed as possible).
In the meantime, until this is added to the core, there are extensions for CSV import and export here:
Thanks @philleepflorence, I tried your extensions but they don't seem to work with Directus 8. The best case for me is just an Export to CSV button near the table view that exports the data of the columns I see in the table. That means if I apply filtering or ordering, the CSV contains exactly the data I see.
Another reason this feature isn't getting AS much attention as it should, is that you can technically work around it in the database directly (since Directus is pure SQL). Other features/bugs are Directus specific and only work if we fix/add them... but this can be done with a database administration tool (less convenient, I know). Just trying to help add some context here.
Is it okay, if I ask here? I'm setting up Directus for one of our projects and want to share the schema with other devs.
Schema only: Just import/export your table setup & corresponding records in directus_* tables
It's fine for me to do that with mysqldump, but I'm not quite sure which directus_ tables would be considered part of the schema, and which are "data"? Has anyone worked that out yet?
So far, I would dump all directus tables with data, except for directus_activity, directus_files, directus_revisions, and directus_user_sessions, which I would dump with mysqldump --no-data:
# make a dump of directus tables
mysqldump db directus_collection_presets directus_collections directus_fields directus_folders directus_migrations directus_permissions directus_relations directus_roles directus_settings directus_users directus_webhooks > schema/db-schema-dump.sql
# make a dump of directus tables, without data
mysqldump --no-data db directus_activity directus_files directus_revisions directus_user_sessions >> schema/db-schema-dump.sql
# make a dump of all non-directus tables, without data
mysqldump --no-data db $(mysql --batch --silent --execute "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'db' AND TABLE_NAME NOT LIKE 'directus\_%';") >> schema/db-schema-dump.sql
@darioseidl While the exact split between "data" and "schema" is a little vague, this is a rough overview (all collections prefixed by directus_):
Collection | Description |
---|---|
activity | Activity log; who did what when |
collection_presets | Last state of layouts and saved bookmarks |
collections | What collections exist / are managed |
fields | Fields and their configuration |
folders | Folders to group files * |
migrations | Keeps track of what migrations have run |
permissions | What the user roles are allowed to access |
relations | How certain fields relate to each other |
revisions | Stores item deltas on CRUD operations |
roles | What user roles exist in the platform |
settings | Key/value pairs of general settings |
user_sessions | What users are currently logged in |
users | Users and their login info |
webhooks | Webhooks and when they fire |
* Not yet in use
As for which ones to migrate, I'd generally recommend all of them in order to prevent mismatches in foreign keys between the tables. That being said, collections, fields, and relations are definitely required for "schema" stuff (without those, you'd lose info like what interface is used, the order on the form, etc).
users / roles / permissions could be truncated, but then you'd have no way to log in, so make sure you have at least 1 active user and the admin/public roles present in roles.
The others can technically be empty and Directus will still work with the schema in place.
Thanks @rijkvanzanten ! That helps me a lot to understand the tables.
I was worried about mismatched foreign keys, but so far the above script (skipping the data in directus_activity, directus_files, directus_revisions, directus_user_sessions) has worked well for me.
As the core will be moved to Laravel, maybe this can help: https://docs.laravel-excel.com/3.1/getting-started/
I was also hoping something like this would be implemented already - currently using Directus for translations only and using the API endpoints to extract them and then transform them into an i18n-compatible layout - next step, do it in reverse :joy:
+1 for this feature - would be extremely helpful!
Has there been any work done on this? Would be very helpful to export the data on the screen to a tsv or a csv.
This has been added in v9, but will not be back-ported to v8 (unless done so by the community).
Hi @benhaynes Do you have a roadmap for v9 that shows where the import/export feature is included?
Thanks a lot!
There's no roadmap yet, but we're working on adding one. Export and backup are already in v9, import might come a bit later.
It would be extremely cool if Directus had a native way of exporting and importing (existing) data and whole Directus setups. Moving from a locally installed Directus to hosted should/could be as easy as hitting export, logging in to hosted, and hitting import.
Different "types" of import/export?:
- Schema only: Just import/export your table setup & corresponding records in directus_* tables
- Data only: Just import/export the data
- Everything: A full database dump. Useful for backups & migrating between servers / hosted-local
Thoughts @WellingGuzman @benhaynes ?