AlicePsyche closed this 2 years ago
Happy to help! Can you post your samples_info.txt file? It looks like the script cannot find the barcode mapping file, even though the file does appear in the directory listing. Maybe the clones file name in samples_info.txt is not formatted correctly?
Thanks!
Here is my sample info:
clones_file              gex_data                             gex_data_type
vdj_v1_lane1_clones.tsv  lane1_filtered_feature_bc_matrix.h5  10x_h5
vdj_v1_lane2_clones.tsv  lane2_filtered_feature_bc_matrix.h5  10x_h5
Tab-separated.
Here is the code I ran to generate the clone files:
python ~/software/conga/scripts/setup_10x_for_conga.py --filtered_contig_annotations_csvfile /home/project/Nextseq_lane2/TCR/filtered_contig_annotations.csv --output_clones_file ./vdj_v1_lane1_clones.tsv --organism human --no_kpca --save_tcrdist_matrices &
Huh, that's a mystery. The error is definitely a failure to find the barcode mapping file (line 83 of merge_samples.py). If you look at the code, you can see how the name of the barcode mapping file is created: by appending ".barcode_mapping.tsv" to the name of the clones file. Is it possible there's an extra whitespace character in the clones file name? Maybe you could add a print statement before line 83, something like
print(bcmap_file)
so we can see what's going wrong?
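A minimal sketch of that check, assuming only what is described above: the mapping file name is the clones file name with ".barcode_mapping.tsv" appended. The helpers `barcode_mapping_path` and `describe` are hypothetical names for illustration and are not part of CoNGA. Printing with repr() makes stray spaces or tabs in the name visible.

```python
import os

# Suffix described in this thread; treated here as an assumption about merge_samples.py.
BCMAP_SUFFIX = ".barcode_mapping.tsv"


def barcode_mapping_path(clones_file):
    """Hypothetical helper: derive the mapping file name from the clones file name."""
    return clones_file + BCMAP_SUFFIX


def describe(clones_file):
    """Return the derived name in repr() form plus whether it exists on disk."""
    bcmap_file = barcode_mapping_path(clones_file)
    # repr() shows embedded whitespace (for example '\t' or a trailing space) explicitly.
    return "{!r} exists={}".format(bcmap_file, os.path.exists(bcmap_file))


if __name__ == "__main__":
    print(describe("vdj_v1_lane1_clones.tsv"))
```

If the printed name contains a space or a second file name, the clones_file column was probably not parsed as intended.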
Hmm, looks like it took the gex_data file as the clones_file?
(conga_new_env) alice@pe2:~/project/CoNGA$ vdj_v1_lane1_clones.tsv.barcode_mapping.tsv
reading: lane1_filtered_feature_bc_matrix.h5 of type 10x_h5
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
/home/software/conga/conga/preprocess.py:226: DeprecationWarning: Use is_view instead of isview, isview will be removed in the future.
if adata.isview: # ran into trouble with AnnData views vs copies
(43691, 20639) lane1_filtered_feature_bc_matrix.h5
vdj_v1_lane2_clones.tsv lane2_filtered_feature_bc_matrix.h5.barcode_mapping.tsv
Traceback (most recent call last):
File "/home/software/conga/scripts/merge_samples.py", line 84, in
It looks like there's a problem with the samples_info.txt file: missing a tab after the second clones file? See how there is whitespace in the filename that's printed out?
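One way to catch this kind of problem before running the merge is to check each row of the samples file for the expected number of tab-separated fields. This is a rough sketch, not part of CoNGA: the function name `find_malformed_rows` is hypothetical, and the three-column layout (clones_file, gex_data, gex_data_type) is taken from the sample info posted in this thread.

```python
def find_malformed_rows(lines, n_columns=3):
    """Return (line_number, message) pairs for rows that look malformed.

    Assumes a tab-separated file with n_columns fields per row,
    as in the samples_info.txt shown in this thread.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != n_columns:
            problems.append(
                (lineno, "expected {} tab-separated fields, found {}".format(
                    n_columns, len(fields)))
            )
        for field in fields:
            # A space inside a field usually means a tab was typed as a space.
            if field != field.strip() or " " in field:
                problems.append((lineno, "whitespace in field {!r}".format(field)))
    return problems


if __name__ == "__main__":
    # Illustrative rows only; the third row uses a space where a tab belongs.
    demo = [
        "clones_file\tgex_data\tgex_data_type\n",
        "vdj_v1_lane1_clones.tsv\tlane1_filtered_feature_bc_matrix.h5\t10x_h5\n",
        "vdj_v1_lane2_clones.tsv lane2_filtered_feature_bc_matrix.h5\t10x_h5\n",
    ]
    for lineno, message in find_malformed_rows(demo):
        print("line {}: {}".format(lineno, message))
```

To check a real file, the same function can be called on an open file handle, for example `find_malformed_rows(open("samples_info.txt"))`.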
Oh, thank you! It seems the problem was introduced when running setup_10x_for_conga.py. I had put an extra ./ here: --output_clones_file ./vdj_v1_lane2_clones.tsv
After I reran the code, merge_samples.py worked well. Thanks a lot!
Hello,
Recently I got my own 10x 5' scRNA-seq data and would like to try it with CoNGA. I followed the tutorial and prepared the TCR file by running conga/scripts/setup_10x_for_conga.py with the filtered_contig_annotations.csv file generated by 10x. But I got an error when merging two lanes. Given that the h5 file is from 10x, I am not sure which variable names are not unique. Do I need to do any filtering before merging the samples?
Here is the code I ran:
python ~/software/conga/scripts/merge_samples.py --samples samples_info.txt --output_clones_file merged_lanes_clones.tsv --output_gex_data merged_lanes_gex.h5ad --organism human --output_distfile merged_lanes_gex_dist
Could you please help me take a look? Thanks a lot in advance!