milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

Postanalysis: preprocessing datasets progress unknown #1822

Open Januaryyiyue opened 1 month ago

Januaryyiyue commented 1 month ago

Hi,

While running mixcr postanalysis, I ran into the message "Preprocessing datasets: progress unknown".

My metadata.tsv is formatted like this:

sample  patient timepoint
Patient-0004-T2-P-Library1-capTCR   P0004_Library1  T2
Patient-0004-T2-P-Library2-capTCR   P0004_Library2  T2
Patient-0006-T1-P-Library1-capTCR   P0006_Library1  T1
...

My postanalysis command is like this:

java -jar /path/to/mixcr/4.6.0/mixcr.jar postanalysis individual --default-downsampling count-read-auto --default-weight-function read --metadata /path/to/metadata.tsv --only-productive --drop-outliers --tables /path/to/postanalysis_output/pa_read.i.tsv --preproc-tables /path/to/postanalysis_output/preproc_read.i.tsv /path/to/analyze_output/*.clns /path/to/postanalysis_output/individual_read.json.gz

And the error message is like this:

The following chains present in the data: [TRA, TRG, TRB, TRD]
Running for chains=TRA
Preprocessing datasets: progress unknown
Running for chains=TRG
Preprocessing datasets: progress unknown
Running for chains=TRB
Preprocessing datasets: progress unknown
Running for chains=TRD
Preprocessing datasets: progress unknown

When I checked my diversity output files, the metrics had not been calculated:

sample  Observed diversity  Shannon-Wiener diversity    Normalized Shannon-Wiener index Inverse Simpson index   Gini index  Chao1 estimate  Efron-Thisted estimate  d50
Patient-0004-T2-Library1-capTCR_mixcr_out.clns  0.0 1.0 -0.0    -Infinity   1.0 0.0 0.0 0.0

mizraelson commented 1 month ago

Is it possible that there were no clones for this chain in Patient-0004-T2-Library1-capTCR_mixcr_out.clns ?
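
One way to check this from the existing output is to scan a diversity table for rows whose observed diversity is zero. This is a minimal sketch, assuming the table is tab-separated with the sample name in column 1 and "Observed diversity" in column 2, as in the excerpt above. The function name `zero_diversity_samples` is hypothetical, and the table path is a placeholder.

```shell
# List samples whose "Observed diversity" (column 2) is zero in a
# tab-separated diversity table. The header line is skipped.
zero_diversity_samples() {
  awk -F'\t' 'NR > 1 && $2 + 0 == 0 { print $1 }' "$1"
}

# Hypothetical usage; substitute the diversity table written for one chain:
# zero_diversity_samples /path/to/postanalysis_output/<diversity table>.tsv
```

If every sample is listed for every chain, the problem is probably not specific to one input file.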

Januaryyiyue commented 1 month ago

Hi,

I have 18 samples in total, and one of them does not have TRB clones.

mizraelson commented 1 month ago

Does it work if you run all other samples but this one?
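
A rough sketch of how the input list could be built without one sample, assuming all `.clns` files sit in one directory and their paths contain no whitespace. The helper name `clns_except` and the name fragment passed to it are hypothetical.

```shell
# Print every .clns file in a directory except those whose file name
# contains the given fragment, one path per line.
clns_except() {
  dir=$1
  skip=$2
  for f in "$dir"/*.clns; do
    [ -e "$f" ] || continue        # the directory has no .clns files
    case "$(basename "$f")" in
      *"$skip"*) ;;                # leave this sample out
      *) printf '%s\n' "$f" ;;
    esac
  done
}

# Hypothetical usage: inspect the list first, then pass it in place of the
# /path/to/analyze_output/*.clns glob in the original command.
# clns_except /path/to/analyze_output <sample-without-TRB>
```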

Januaryyiyue commented 1 month ago

I tried it, and it's still not working.

mizraelson commented 2 weeks ago

Hi, it seems that this issue might be caused by the following parameters:

--default-downsampling count-read-auto
--drop-outliers

Could it be that the samples have a low abundance of clonotypes? If that’s the case, you might lose a significant portion of the data. Have you tried running the analysis without these two parameters? Do you get output for all samples in that case?
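
For reference, such a run might look like the sketch below. It reuses the paths from the original command, drops `--drop-outliers`, and sets `--default-downsampling none`. That assumes the option still has to be given and that `none` is an accepted value, which is worth confirming against the documentation for your MiXCR version. The `MIXCR` variable, the function name, and the `_nofilter` output names are hypothetical and only serve to keep the earlier results from being overwritten.

```shell
# Launcher for MiXCR; defaults to the jar invocation from the original command.
MIXCR=${MIXCR:-"java -jar /path/to/mixcr/4.6.0/mixcr.jar"}

# Same postanalysis as before, but with downsampling disabled (assumed value
# "none") and without --drop-outliers. Output names are placeholders.
rerun_without_filters() {
  $MIXCR postanalysis individual \
    --default-downsampling none \
    --default-weight-function read \
    --metadata /path/to/metadata.tsv \
    --only-productive \
    --tables /path/to/postanalysis_output/pa_read_nofilter.i.tsv \
    --preproc-tables /path/to/postanalysis_output/preproc_read_nofilter.i.tsv \
    /path/to/analyze_output/*.clns \
    /path/to/postanalysis_output/individual_read_nofilter.json.gz
}

# rerun_without_filters
```

If the diversity tables are populated in this run, the two parameters can be added back one at a time to see which of them removes the data.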