Closed cajames2 closed 6 years ago
There are a couple of issues preventing your analysis. First of all, the file you attached does not include the cdr3a_nucseq column, which is required for a number of steps in the analysis; when it is missing, the resulting errors are written to the .err files. Second, --make_fake_beta and --make_fake_quals will only function properly when using --pair_seqs_file for your data input; if you are going to parse the sequences yourself, you will also need to generate your own fake qualities and fake beta chains. I'm attaching a file with an example parsed ABtcr that might be helpful in that regard: FakesExample.txt
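If it helps, a quick way to confirm the first point before launching the pipeline is to look at the header of your parsed file. The snippet below is only a minimal sketch and is not part of tcr-dist: the helper name missing_columns is made up for illustration, and the only column it checks by default is cdr3a_nucseq, since that is the one known to be missing here. Compare your header against FakesExample.txt to see which other columns you need.

```python
import csv


def missing_columns(tsv_path, required=("cdr3a_nucseq",)):
    """Return the names in `required` that are absent from the TSV header.

    Hypothetical helper for illustration; not a tcr-dist function.
    The default `required` tuple lists only the column discussed in
    this thread and is not a complete list of what the pipeline needs.
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

Calling missing_columns on the path to your parsed file returns an empty list if the column is present, and ['cdr3a_nucseq'] otherwise.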
Hope this helps. Feel free to email me directly if you need more assistance getting things to work.
Jeremy
Hello,
I am trying to run TCR-dist on a data set of parsed TCR alpha chains.
An abridged version of my data set is here in .txt format: clones_file.txt
The code I used to run the basic analysis script is as follows:
python /Users/cajames2/tcr-dist/run_basic_analysis.py --organism human --parsed_seqs_file /Users/cajames2/TCRSeq/clones_file.tsv --make_fake_beta --make_fake_quals
The script then runs all the way through, but returns blank tables and plots. I ran the test data set "test_small_human_pairseqs_v1_parsed_seqs.tsv" and saw outputs. I also deleted the beta chain columns and quality scores from that test file and ran only the alpha chain information with --make_fake_beta and --make_fake_quals, and it worked just fine.
I think I have traced the issue to something that the _parse_tsvfile function depends on. I modified the parse_tsv.py script to print the _allclones file so that I could see whether my data was being read correctly, and this file is blank after I run the run_basic_analysis.py script. However, when I run the parse_tsv.py script on my data independently, it reads my data and generates a populated _allclones file.
Do you have any insight into why the _parse_tsvfile function won't read my data when run in the context of the run_basic_analysis.py script, but works just fine when run independently?