mathesar-foundation / mathesar

Web application providing an intuitive user experience to databases.
https://mathesar.org/
GNU General Public License v3.0
2.28k stars 316 forks source link

Implement `table.import` RPC endpoint #3623

Closed Anish9901 closed 6 days ago

Anish9901 commented 2 weeks ago

Fixes #3622

Checklist

- [x] My pull request has a descriptive title (not a vague title like `Update index.md`). - [x] My pull request targets the `develop` branch of the repository - [x] My commit messages follow [best practices][best_practices]. - [x] My code follows the established code style of the repository. - [x] I added tests for the changes I made (if applicable). - [x] I added or updated documentation (if applicable). - [x] I tried running the project locally and verified that there are no visible errors. [best_practices]:https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53 ## Developer Certificate of Origin
Developer Certificate of Origin ``` Developer Certificate of Origin Version 1.1 Copyright (C) 2004, 2006 The Linux Foundation and its contributors. 1 Letterman Drive Suite D4700 San Francisco, CA, 94129 Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Developer's Certificate of Origin 1.1 By making a contribution to this project, I certify that: (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved. ```
Anish9901 commented 2 weeks ago

You'd run that function using Cursor.copy.

That's an interesting idea @mathemancer, I wasn't aware that Cursor.copy could be used to execute functions similar to Cursor.execute. I'll give it a shot!

mathemancer commented 2 weeks ago

You'd run that function using Cursor.copy.

That's an interesting idea @mathemancer, I wasn't aware that Cursor.copy could be used to execute functions similar to Cursor.execute. I'll give it a shot!

I double-checked since I wasn't sure. You can't actually run multiple operations using Cursor.copy, so you'd still have to create the table, then do the copy operation as a separate call. I still think it would be more robust to do the escape/quote wrangling on the DB, but now I'm much less convinced it's worth the effort to move the logic there. Use your judgement w.r.t. whether you want to do the string composition in python (but much more carefully) or on the DB.

Anish9901 commented 1 week ago

@mathemancer I ended up choosing the 2nd option since that seemed more robust here are some sample returns from the SQL function:

For a TSV file:

{
    'copy_sql': 'COPY public."tsv import 2" ("Center", "Status", "Case Number", "Patent Number", "Application SN", "Title", "Patent Expiration Date") FROM STDIN CSV HEADER DELIMITER \'\t\' QUOTE \'"\' ENCODING \'UTF8\'',
    'table_oid': '2162234'
}

For a CSV file:

{
    'copy_sql': 'COPY public."cSv import 1" ("Integer", "Boolean", "Text", "Decimal") FROM STDIN CSV HEADER DELIMITER \',\' ENCODING \'utf-8\'',
    'table_oid': '2162409'
}