soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Ability to export TSVs with header columns #434

Open Benjamin-Lee opened 3 years ago

Benjamin-Lee commented 3 years ago

Expected Behavior

Maybe this already exists and I just couldn't find it in the manual or the CLI reference, but I would like a way to make the first row of TSV outputs contain the column names. This would make exploring the results with tools such as pandas or VisiData a lot easier.

Current Behavior

The first line of the TSV file is the first row of the data.

Context

Right now, if I want to use a tool such as Pandas to analyze mmseqs output, I have to manually pass in the header columns. Worse, when I went to share the data with a collaborator, I had to tell him the columns separately. This is a brittle approach, both for data reuse and archiving.
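For anyone in the same situation, the manual workaround can be sketched as a small script that prepends a header line before the file is shared. This is only a sketch, not an MMseqs2 feature: `add_header` and `DEFAULT_COLUMNS` are hypothetical names, the column list assumes the default `convertalis` column order (it has to be changed if a custom `--format-output` was used), and the sample row is made up.

```python
import io

# Assumption: default column order of `mmseqs convertalis` output.
# Adjust this list if a custom --format-output was used.
DEFAULT_COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def add_header(src, dst, columns=DEFAULT_COLUMNS):
    """Copy headerless TSV lines from src to dst, writing column names first."""
    dst.write("\t".join(columns) + "\n")
    for line in src:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        n_fields = line.count("\t") + 1
        if n_fields != len(columns):
            # Fail loudly instead of writing a header that does not match the data.
            raise ValueError(f"expected {len(columns)} fields, got {n_fields}")
        dst.write(line + "\n")


# Made-up example row, for illustration only.
sample = io.StringIO("q1\tt1\t0.95\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200\n")
out = io.StringIO()
add_header(sample, out)
```

With real files, `src` and `dst` would be handles from `open(...)`. The resulting file can then be loaded with `pandas.read_csv(path, sep="\t")` without passing `names`, and a collaborator no longer needs the column list sent separately.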


martin-steinegger commented 3 years ago

@Benjamin-Lee Yes, I agree this would be great, but there is currently no way in MMseqs2 to add header lines. We have already discussed this and might add the feature in a future release.