Open jflucier opened 3 months ago
@jflucier Is your DIA-NN input file generated by running FragPipe?
Also, which version are you using?
Hi
Is your DIA-NN input file generated by running FragPipe?
No, I run DIA-NN from the command line on a Linux cluster. Here is the command I use:
diann --threads 40 --verbose 2 \
--f $SLURM_TMPDIR/data/Fjolla_DIA_15KO_1_Slot1-32_1_24034.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_15KO_2_Slot1-33_1_24036.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_15KO_3_Slot1-34_1_24038.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_15KO_4_Slot1-35_1_24040.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_minus_1_Slot1-36_1_24046.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_minus_2_Slot1-37_1_24048.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_minus_3_Slot1-38_1_24050.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_minus_4_Slot1-39_1_24052.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_plus_1_Slot1-40_1_24055.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_plus_2_Slot1-41_1_24057.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_plus_3_Slot1-42_1_24059.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_LysM_plus_4_Slot1-43_1_24061.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_WT_1_Slot1-28_1_24025.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_WT_2_Slot1-29_1_24027.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_WT_3_Slot1-30_1_24043.d \
--f $SLURM_TMPDIR/data/Fjolla_DIA_WT_4_Slot1-31_1_24031.d \
--temp $SLURM_TMPDIR/temp \
--cut K*,R* --missed-cleavages 2 --met-excision \
--fasta "$SLURM_TMPDIR/UP000000589_10090_combo.fasta" --fasta-search \
--out-lib "$SLURM_TMPDIR/out/report-lib.tsv" --out-lib-copy \
--out "$SLURM_TMPDIR/out/report.tsv" \
--mass-acc-ms1 20 --mass-acc 20 \
--min-pep-len 7 --max-pep-len 30 \
--min-pr-charge 1 --max-pr-charge 5 \
--min-pr-mz 100 --max-pr-mz 1700 \
--min-fr-mz 100 --max-fr-mz 1500 \
--predictor --reanalyse --matrices --smart-profiling --pg-level 1 \
--unimod4 --unimod35 --var-mod UniMod:1,42.010565,*n,ntermacetyl
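The sixteen --f arguments above could also be generated instead of being listed by hand. This is a minimal sketch, assuming every run is a .d entry directly under one data directory; build_f_args is a hypothetical helper name and not part of DIA-NN:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "--f" and the run path on separate lines
# for every .d entry found in the given directory.
build_f_args() {
    local data_dir="$1"
    local run
    for run in "$data_dir"/*.d; do
        # Skip the unexpanded pattern when the directory has no .d entries.
        [ -e "$run" ] || continue
        printf -- '--f\n%s\n' "$run"
    done
}

# Usage sketch (bash): collect the pairs into an array, then pass them to
# diann together with the remaining options from the command above.
# mapfile -t f_args < <(build_f_args "$SLURM_TMPDIR/data")
# diann --threads 40 --verbose 2 "${f_args[@]}" --temp "$SLURM_TMPDIR/temp"
```

The runs are picked up in the shell's glob order, which may differ from a hand-written list.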
Also, which version are you using?
I use DIA-NN v1.8.1, installed inside a Singularity container built from a Docker image.
Thank you again for your help
I was asking about the version of FragPipeAnalystR. I believe FragPipe doesn't generate reports with this issue. We would like to support DIA-NN reports better, but currently we don't support them. If you are willing to share your file, you can send it to me by email at yihsiao@umich.edu
The FragPipeAnalystR version installed is 0.1.7
I will send my analysis file directly to the email address you provided.
Thanks again!
Hello,
I managed to get this working by filtering the pg report to keep only proteotypic protein groups (those without a ; in the protein group name). Here is the command I used to filter:
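perl -ne '
# Drop the trailing newline, then split the row into tab-separated fields.
chomp($_);
my @t = split("\t",$_);
# The first field is the protein group; its members are separated by ";".
my @prot_ident = split(";",$t[0]);
# Keep the row only when the group contains exactly one protein.
if(scalar(@prot_ident) == 1){
    print $_ . "\n";
}
' report.pg_matrix.tsv > report.pg_matrix.proteoptypic.tsv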
Afterwards, the following command ran successfully:
ccrcc <- make_se_from_files(
"report.pg_matrix.proteoptypic.tsv",
"experiment_annotation.tsv",
type = "DIA",
level = "gene"
)
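For anyone without Perl at hand, roughly the same filter can be sketched with awk. It assumes a tab-separated matrix whose first column is the protein group, and filter_single_protein_groups is a hypothetical function name:

```shell
# Hypothetical helper: keep only rows whose first tab-separated field
# contains no ";" (protein groups made of a single protein).
# The header row is kept as long as its first field has no ";".
filter_single_protein_groups() {
    awk -F'\t' 'index($1, ";") == 0' "$1"
}

# Usage sketch:
# filter_single_protein_groups report.pg_matrix.tsv > report.pg_matrix.proteoptypic.tsv
```

Edge cases such as empty lines or a trailing ; are handled slightly differently than in the Perl version, so it is worth comparing row counts on a real file.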
Hi,
When I pass my DIA-NN result file to the make_se_from_files function, it returns an error.
I traced it back by executing the make_se_from_files function line by line and found where the error happens: the make_unique function returns duplicates. If I inspect the returned proteins_unique object, the returned ID is truncated when a protein group is composed of multiple proteins. For example:
DIA-NN protein group: ID
A0A075B5M4: A0A075B5M4
A0A075B5M4;A0A0A6YYE7: A0A075B5M4
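The collision can be reproduced outside R to see how many protein groups are affected. The sketch below assumes the truncation keeps only the first accession before the ; (this matches the example above but has not been checked against the make_unique source) and that the first column of the tab-separated matrix holds the protein group. list_truncated_duplicates is a hypothetical name:

```shell
# Hypothetical diagnostic: print every first accession that would appear
# more than once if each protein group were cut at its first ";".
list_truncated_duplicates() {
    awk -F'\t' '
        NR > 1 {                      # skip the header row
            split($1, parts, ";")     # parts[1] is the first accession
            seen[parts[1]]++
        }
        END {
            for (id in seen)
                if (seen[id] > 1) print id
        }
    ' "$1" | sort
}

# Usage sketch:
# list_truncated_duplicates report.pg_matrix.tsv
```

An empty result would suggest that truncation alone does not create duplicate IDs in that file.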
Would it be OK to prefilter the DIA-NN results to remove all rows where the group contains more than one protein, such as A0A075B5M4;A0A0A6YYE7, or would that bias the results?
Thank you in advance for your help, JF