bitwarden / clients

Bitwarden client apps (web, browser extension, desktop, and cli).
https://bitwarden.com
Other
9.11k stars 1.21k forks source link

URGENT Web CSV Import from LastPass CORRUPTS USER DATA with no errors or warnings. #4912

Open donpellegrino opened 1 year ago

donpellegrino commented 1 year ago

Steps To Reproduce

  1. Follow the steps in the documentation at https://bitwarden.com/help/import-from-lastpass/
  2. Observe that the imported data is badly corrupted for hundreds of records.
  3. Observe that many nonsensical Folders are created, each needing to be subsequently deleted manually due to a lack of bulk operations on Folders.
  4. Observe that backing out the operation as a transaction is impossible.
  5. Observe that both the imported data is corrupted and that any data already in the Vault is impossible to manage as a result of being combined with the corrupt data.

One possible cause of this is the handling of newlines in the comment field of LastPass records. When a newline is encountered it seems that Bitwarden starts processing a new record instead of handling a note field with multiple lines.

Expected Result

A lossless import is the only acceptable result. In the cases where LastPass has failed to define an unambiguous export, errors and warning need to be reported to alert the user. It must be possible to undo the full load if it fails.

Actual Result

Bitwarden web is currently operating in a worst-case scenario. Data is corrupted and no information about the corruption is being communicated to the user. This feature should be disabled pending a fix.

Screenshots or Videos

No response

Additional Context

No response

Operating System

Windows

Operating System Version

No response

Web Browser

Microsoft Edge

Browser Version

No response

Build Version

Version 2023.2.0

Issue Tracking Info

donpellegrino commented 1 year ago

A work-around to this critical defect is to use the Bitwarden Importer utility. The documentation at https://bitwarden.com/help/import-from-lastpass/ should be rewritten to warn users of the critical defects in the web import from LastPass and instruct them to use the Bitwarden Importer tool.

donpellegrino commented 1 year ago

Attachments LastPass attachments also need to be addressed, if only through an alert that the operation does not include them. Additional detail on the limitations of extracting attachments from LastPass can be found at https://stackoverflow.com/questions/67098639/how-to-find-all-records-in-lastpass-that-have-files-attachments-in-them

joshuabjordan commented 1 year ago

Hi there,

I am unable to reproduce this issue; it has been escalated for further investigation. If you have more information that can help us, please add it below.

Thanks!

donpellegrino commented 1 year ago

A challenge with sharing the original input for reproducibility is of course that is contains real account information and about 1,000 records. I will need to generate synthetic data and a fake account on Bitwarden to reproduce. However, I suspect the root cause is records in the LastPass .csv output where the "extra" value goes multiple lines or contains commas. At a glance, the CSV output from LastPass leaves opportunities for processing ambiguities.

Just to confirm, the CSV headers are: url,username,password,totp,extra,name,grouping,fav

In addition, I recall the import simply did not handle folder hierarchy. Instead, folder/subfolder names were combined into one item.

One approach I could try is simply putting the file I have through a CSV parser and checking the parse errors.

Any pointers to the code where LastPass CSV parsing is done would be helpful. Maybe isolating the execution of the parser on my existing file would help me craft a succinct test case.

joshtrichards commented 1 year ago

Just a BW user myself, but the LP parsing is done here:

https://github.com/bitwarden/clients/blob/master/libs/common/src/importers/lastpass-csv-importer.ts

I believe this is used for both web and CLI import workflows.

The base parser underneath all that is Papa Parse (https://github.com/mholt/PapaParse)

The dedicated BW Exporter+Importer is brand-new (the one in its own repo) and I haven't used it, but I believe it invokes the BW CLI import command so it should use the same parsing code.

Your CSV headers look fine to me.

I agree it can be challenging to collaborate on isolating issues with real data. I think you're on the right path with moving reproducing to a test account. Attempting to creating some synthetic data also probably a good step, but further isolation using real data may be necessary first (but within the test account). One idea is to break down your LP import file into 2 or 3 (or 5-10 - whatever it takes) separate imports (adding the CSV header to each one) and seeing which ones it occurs in / doesn't occur in.

kenstir commented 2 months ago

TL;DR

I also experienced corruption of ~5% of items imported directly from LastPass using the Chrome extension.

When I tried to reproduce the issue using CSV import, there was no corruption I could find.

Details

I have experienced this too when using the direct import from LastPass. Out of 705 items, roughly 40 of them had corrupted URIs. For example: 2024-08-13 15_42_56 If I copy that URI and paste it into emacs, I get two bytes: 0x09 0x02

To fix the issue, I exported LP to CSV and Bitwarden to CSV and compared the two. I noticed many unusual things in the LP CSV export:

So I tried to reproduce the problem using a CSV import, using both the web vault and the Chrome extension, and the corruption didn't happen, even when importing the entire CSV file.