Januaryyiyue closed this issue 3 months ago
To elaborate, our .clns file name is structured like this:
sample-01-P-text_text.clns
And the metadata file is structured like this:
sample,sampleType
sample-01-P-text_text,P
Hi,
First of all, there is no need for the -OdiversityMeasures=diversity.observed,diversity.shannonWiener,diversity.chao1,diversity.normalizedShannonWienerIndex,diversity.inverseSimpsonIndex,diversity.giniIndex,diversity.d50,diversity.efronThisted parameter. All metrics will be evaluated by default.
Regarding the error, it seems that in the --group sampleType parameter you specified a column that is not present in the metadata. Could you please check that there are no extra spaces or other issues in the metadata column name, and that you refer to the correct file?
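A quick way to carry out the check suggested above is to print each header field wrapped in brackets, so that a leading or trailing space becomes visible, and then test for an exact match on the column name. This is only a sketch: the helper names show_header and has_column are made up for illustration, and the demo file deliberately contains a trailing space after sampleType.

```shell
#!/bin/bash
# Hypothetical helper: print every column name of a CSV header on its own
# line, wrapped in brackets, so stray spaces are easy to spot.
show_header() {
    head -n 1 "$1" | tr ',' '\n' | sed 's/.*/[&]/'
}

# Hypothetical helper: succeed only if the header contains exactly this
# column name (case-sensitive, whole field).
has_column() {
    head -n 1 "$1" | tr ',' '\n' | grep -Fxq -- "$2"
}

# Demo on a throwaway file whose header has a trailing space on purpose.
demo=$(mktemp)
printf 'sample,sampleType \nsample-01-P-text_text,P\n' > "$demo"

show_header "$demo"
if has_column "$demo" sampleType; then
    echo "column sampleType found"
else
    echo "column sampleType not found as an exact match"
fi
```

To inspect a real file, call show_header with the path passed to --metadata.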
Dear @mizraelson,
Thank you very much for your time and help with this, I really appreciate it. I am working with @Januaryyiyue on this issue. I removed the -OdiversityMeasures option and double-checked the metadata file. Unfortunately, I still received the same error. I am sure this column name exists in the metadata, and after double-checking I am confident that there are no extra spaces or other issues with the metadata file. I wanted to ask whether anything else might be causing this issue. I'd also be happy to arrange a call to discuss further.
Here is a sample script:
#!/bin/bash
module load java/8
module load mixcr/4.6.0
java -Xmx28g -jar /cluster/tools/software/centos7/mixcr/4.6.0/mixcr.jar postanalysis individual --default-downsampling count-read-auto --default-weight-function read --metadata /cluster/projects/mixcr_postanalysis/metadata.csv --group sampleType /cluster/projects/mixcr_postanalysis/postanalysis/ALQ-02-012-T0-P-DNA-capTCR_S38.clns /cluster/projects/mixcr_postanalysis/postanalysis/ALQ-02-012-T0-P-DNA-capTCR_S38_result.json --only-productive --drop-outliers
The top of the metadata file:
sample,sampleType
ALQ-02-012-T0-P-DNA-capTCR_S38,P
ALQ-02-013-T0-P-DNA-capTCR_S26,P
The name of the corresponding clones file looks like this:
ALQ-02-012-T0-P-DNA-capTCR_S38.clns
ALQ-02-013-T0-P-DNA-capTCR_S26.clns
Below is the error:
By using this software, you agree the license at https://mixcr.readthedocs.io/en/develop/license.html
The following have been reloaded with a version change:
1) java/8 => java/18
Please copy the following information along with the stacktrace:
Version: 4.6.0; built=Sat Dec 09 14:48:42 EST 2023; rev=c9fafa41fe; lib=repseqio.v4.0
OS: Linux
Java: 18.0.1
Cmd args: postanalysis individual --default-downsampling count-read-auto --default-weight-function read --metadata /cluster/projects/mixcr_postanalysis/metadata.csv --group sampleType /cluster/projects/mixcr_postanalysis/postanalysis/ALQ-02-012-T0-P-DNA-capTCR_S38.clns /cluster/projects/mixcr_postanalysis/postanalysis/ALQ-02-012-T0-P-DNA-capTCR_S38_result.json --only-productive --drop-outliers
picocli.CommandLine$ExecutionException: Error while running command individual java.lang.NullPointerException
at com.milaboratory.mixcr.cli.Main.registerExceptionHandlers$lambda-12(SourceFile:395)
at picocli.CommandLine.execute(CommandLine.java:2088)
at com.milaboratory.mixcr.cli.Main.main(SourceFile:101)
Caused by: java.lang.NullPointerException
at com.milaboratory.mixcr.cli.postanalysis.CommandPa.groupSamples(SourceFile:286)
at com.milaboratory.mixcr.cli.postanalysis.CommandPa.run1(SourceFile:308)
at com.milaboratory.mixcr.cli.MiXCRCommandWithOutputs.run0(SourceFile:69)
at com.milaboratory.mixcr.cli.MiXCRCommand.run(SourceFile:37)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
at com.milaboratory.mixcr.cli.Main.registerLogger$lambda-27(SourceFile:514)
at picocli.CommandLine.execute(CommandLine.java:2078)
... 1 more
App version: 4.6.0; built=Sat Dec 09 14:48:42 EST 2023; rev=c9fafa41fe; lib=repseqio.v4.0
I'd be very grateful for any assistance in solving this error. I wonder if it may have to do with the version of java we are using?
Thank you very much for your time and help with this, I really appreciate it and wish you the best,
Dory
Hi Dory,
We have found the bug: it was due to the uppercase letters in the column name. Can you please either try running the latest develop version or use --group sampletype instead?
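For the --group sampletype workaround, the reply does not say whether the header in the metadata file also has to change. If you prefer the file and the option to match, the header row can be lowercased without touching the data rows. A minimal sketch, assuming a comma-separated file with the header on the first line; lowercase_header is a hypothetical helper name and the demo data is taken from this thread.

```shell
#!/bin/bash
# Hypothetical helper: print a CSV with only its header row lowercased.
# Data rows (sample names, sample types) are passed through unchanged.
lowercase_header() {
    awk 'NR == 1 { print tolower($0); next } { print }' "$1"
}

# Demo on a throwaway copy of the metadata shown in this thread.
demo=$(mktemp)
printf 'sample,sampleType\nALQ-02-012-T0-P-DNA-capTCR_S38,P\n' > "$demo"
lowercase_header "$demo"
```

Redirect the output to a new file rather than overwriting the original, so the original metadata stays available once a fixed MiXCR version is installed.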
Dear @mizraelson,
This worked - thank you very much! I really appreciate your time and help with this. I was able to have the script run to completion by changing to --group sampletype and adjusting the header in the metadata file to have a lowercase t: sampletype
I just wanted to confirm whether postanalysis individual is the best way to downsample the clones across a directory. I have 100 .clns files and want them to be downsampled relative to other files based on sample type (i.e., all files of one sample type downsampled relative to the read counts of the other .clns files of that sample type). I can adapt the script to run in overlap mode if preferred.
Thank you very much for your time and help with this, I really appreciate it and wish you the best,
Dory
If you just want to downsample the data, you can use the dedicated mixcr downsample command.
Dear @mizraelson,
Thank you for your message and assistance! I really appreciate it. I am interested in downsampling and then running mixcr postanalysis on the .clns files to compute statistical differences in diversity among them. Could you please confirm whether it is best to first run mixcr downsample and then mixcr postanalysis individual on the output from mixcr downsample? Given that mixcr postanalysis individual only processes one .clns file at a time, I am unsure whether the downsampling applied within that command can be accurately normalized relative to all other files in a directory or in the metadata.csv file.
Thank you very much for your time and help with this, I really appreciate it. Wishing you the best,
Dory
If you plan to run mixcr postanalysis individual, there is no need to run mixcr downsample beforehand; you can use the --default-downsampling count-read-auto parameter with mixcr postanalysis individual. mixcr postanalysis individual takes multiple .clns files as input, so the data will be downsampled across all files. You can check this link for reference.
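As a rough sketch of what passing all files to one call could look like, the wrapper below hands every .clns file in a directory to a single postanalysis individual invocation. It only reuses options that appear earlier in this thread; run_pa is a hypothetical helper, the `mixcr` launcher being on PATH is an assumption, and the demo uses `echo` as the launcher so that the command is printed instead of executed.

```shell
#!/bin/bash
# Hypothetical helper (not part of MiXCR): pass every .clns file in a
# directory to one "postanalysis individual" call.
# $1 = launcher ("echo" for a dry run, "mixcr" for a real run; assumes
#      the mixcr launcher script is on PATH)
# $2 = directory with .clns files, $3 = metadata file, $4 = output file
run_pa() {
    "$1" postanalysis individual \
        --default-downsampling count-read-auto \
        --default-weight-function read \
        --metadata "$3" \
        --group sampletype \
        "$2"/*.clns \
        "$4"
}

# Dry run on a throwaway directory holding two empty placeholder files
# named after the samples in this thread.
demo_dir=$(mktemp -d)
touch "$demo_dir/ALQ-02-012-T0-P-DNA-capTCR_S38.clns" \
      "$demo_dir/ALQ-02-013-T0-P-DNA-capTCR_S26.clns"
run_pa echo "$demo_dir" metadata.csv result.json
```

Once the printed command looks right, replace echo with mixcr. Options such as --only-productive and --drop-outliers from the earlier script can be added to the same call if they are wanted.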
Thank you very much for your reply and sharing this with me, I really appreciate all your time and help with my requests. I will use the commands in the reference link. Best wishes.
Hello,
We are using MiXCR/4.6.0 to run the postanalysis command. We got the following error:
Here's the first few lines of the metadata.csv we use:
We would like to know where this error is coming from, and what we can do to solve it. Thank you.