immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

Can immunarch 0.9.0 repLoad() .gz archives? #336

Closed github-wow closed 1 year ago

github-wow commented 1 year ago

Running repLoad() on .tsv.gz archives produces a mismatch between the sample names taken from the files and the corresponding sample names listed in the metadata:

== Step 1/3: loading repertoire files... ==

Processing "<initial>" ...
-- [1/N] Parsing "./metadata.tsv" -- metadata
-- [2/N] Parsing "./s1.tsv.gz" -- mixcr 
...
-- [N/N] Parsing "./sN.tsv.gz" -- mixcr     

== Step 2/3: checking metadata files and merging files... ==

Processing "<initial>" ...
  -- Samples found in the metadata, but not in the folder:
     s1.tsvss2.tsvss3.tsvs...sN.tsv
  Did you correctly specify all the sample names in the metadata file?
  -- Samples found in the folder, but not in the metadata:
     s**1**s2s3...s**N**
  Did you add all the necessary samples to the metadata file with correct names?
  Creating dummy sample records in the metadata for now...

== Step 3/3: processing paired chain data... ==

Done!

There were 18 warnings (use warnings() to see them)

Alexander230 commented 1 year ago

Hello, @github-wow!

I've made some improvements to path parsing and to the error messages shown when loading samples from multiple files. You can install the development version of immunarch with these changes as described here: https://immunarch.com/#latest-pre-release-on-github

It works well with my data. If this version still doesn't work with your data, you can manually edit the sample names in the metadata file so that they match the names from the error message. Alternatively, you can attach example data that reproduces the error, and I can help with loading it into immunarch.
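
For the manual workaround, a minimal sketch in base R of renaming the metadata entries might look like the following. It assumes the metadata file has a `Sample` column and that the mismatch is only a leftover file extension, as the log above suggests (`s1.tsv` in the metadata vs. `s1` in the folder). The helper `strip_sample_ext` is hypothetical and not part of immunarch, and the commented-out file path is taken from the log.

```r
# Hypothetical helper (not an immunarch function): drop a trailing
# .tsv/.txt/.csv extension, optionally followed by .gz, from sample names.
strip_sample_ext <- function(x) {
  sub("\\.(tsv|txt|csv)(\\.gz)?$", "", x)
}

# Toy metadata standing in for the real file; only the Sample column matters here.
meta <- data.frame(
  Sample = c("s1.tsv", "s2.tsv.gz", "s3"),
  stringsAsFactors = FALSE
)

# Rewrite the names so that they match the names reported for the folder.
meta$Sample <- strip_sample_ext(meta$Sample)
print(meta$Sample)

# To apply the same fix to a real metadata file (path assumed from the log):
# meta <- read.delim("./metadata.tsv", stringsAsFactors = FALSE)
# meta$Sample <- strip_sample_ext(meta$Sample)
# write.table(meta, "./metadata.tsv", sep = "\t", quote = FALSE, row.names = FALSE)
```

Names without a recognized extension are left unchanged, so running this on metadata that is already correct should do no harm.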

Best regards, Aleksandr

github-wow commented 1 year ago

@Alexander230: thanks a lot for the fix, it worked nicely.