The same holds true when the --voxel-layers argument is omitted. However, the input csv file is bigger:
(Roan) [rvanscheppingen@n0118 All_transcripts]$ zcat full_transcripts.csv.gz | wc -l
22073511
Is there any subsampling, or does proseg only take transcripts within a certain distance of the polygons?
I've seen similar behaviour before, but then the difference was only 20K transcripts out of a total of 1 million (same dataset). For now I assume that some transcripts are very far away from cells and are therefore not taken into account. I'll let it run and inspect later.
Edit: upon further inspection I see that --coordinate-scale is responsible. Reducing it to the recommended 0.12 for CosMx data returns close to 22 million transcripts. I'm not sure whether the option is intended to influence the maximum distance to a cell in this way.
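For anyone who wants to repeat the comparison, here is a minimal Python sketch of how the input and output totals could be checked against each other. It rests on assumptions not confirmed in this thread: that full_transcripts.csv.gz has one header line followed by one transcript per row (in which case `wc -l` overcounts by one), and that counts.csv.gz is a purely numeric matrix with a single header row of gene names. The function names are made up for illustration.

```python
import csv
import gzip


def count_input_transcripts(path, has_header=True):
    """Count data rows in a gzipped transcript CSV (assumed: one transcript per row)."""
    with gzip.open(path, "rt", newline="") as fh:
        n_rows = sum(1 for _ in csv.reader(fh))
    # Assumption: the first row is a header and is not a transcript.
    return n_rows - 1 if has_header and n_rows > 0 else n_rows


def sum_count_matrix(path):
    """Sum all numeric entries of a gzipped count matrix (assumed: header row, then counts)."""
    total = 0.0
    with gzip.open(path, "rt", newline="") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the assumed header row of gene names
        for row in reader:
            for value in row:
                try:
                    total += float(value)
                except ValueError:
                    # Non-numeric cells are skipped; a numeric ID column would
                    # still be summed, so check the file layout first.
                    pass
    return total
```

Calling `count_input_transcripts("full_transcripts.csv.gz")` and `sum_count_matrix("counts.csv.gz")` and comparing the two numbers should show how many transcripts were left unassigned for a given --coordinate-scale value, provided the file layout matches the assumptions above.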
Command used (with --coordinate-scale 1):
(Roan) [rvanscheppingen@n0118 All_transcripts]$ proseg full_transcripts.csv.gz --cosmx --coordinate-scale 1 --output-maxpost-counts counts.csv.gz --nthreads 14 --voxel-layers 15