Closed pradyumnasagar closed 7 years ago
I got the same error with the sample data as well.
Did the "tsg" command fail for you too?
This is kind of an odd error. I don't get it when I run version 1.0.7 on the quick start example. Could you run the package's unit tests to see if you get the same error message? The unit tests have been continuously tested on Python 2.7 and 3.5 and have not shown an error. If you do get an error on the unit tests, it might be an installation problem, or it will at least help me debug what is happening.
This is how you would run the unit tests:
pip uninstall probabilistic2020
pip install nose
(Hopefully this installs the latest version, 1.3.7.)
wget https://github.com/KarchinLab/probabilistic2020/archive/v1.0.7.tar.gz
tar xvzf v1.0.7.tar.gz ; cd probabilistic2020-1.0.7
make build
make tests
Yes, even the tsg command gave the same error: "ValueError: Chasm context requires a three nucleotide string (Provided: "")"
No errors were observed when I ran the unit tests, but there were some warnings.
make tests
nosetests --nologcapture tests/
.[fai_load] build FASTA index.
/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py:296: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[key] = _infer_fill_value(value)
/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py:476: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
.[fai_load] build FASTA index.
/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/python/count_frameshifts.py:47: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  gene_df['unmapped'] = [(1 if x is None else 0) for x in fs_pos]
..[fai_load] build FASTA index.
.[fai_load] build FASTA index.
..[fai_load] build FASTA index.
/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/python/p_value.py:22: RuntimeWarning: divide by zero encountered in log
  chisq_stat = np.sum(-2*np.log(pvals))
.......[fai_load] build FASTA index.
/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/python/indel.py:177: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  mut_df['indel type'] = ''
/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/python/mutation_context.py:83: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Ran 15 tests in 267.090s
OK
So you ran the quick start command verbatim? Do you get the error when you replace "probabilistic2020 tsg" in the quick start example with "prob2020/console/probabilistic2020.py tsg" (from the source code you downloaded)?
Yes, I get the same error when I run from source with my data.
prob2020/console/probabilistic2020.py tsg -i 2020.fa -b 2020.bed -m 2020.maf -o 2020testout.txt
Version: 1.0.7
Command: prob2020/console/probabilistic2020.py tsg -i 2020.fa -b 2020.bed -m 2020.maf -o 2020testout.txt
Kept 1107 mutations after droping mutations with missing information (Droped: 0)
Dropped 33 mutations after only keeping Missense_Mutation, Silent, Nonsense_Mutation, Splice_Site, Nonstop_Mutation, Translation_Start_Site. Indels are processed separately.
Dropped 0 mutations after only keeping valid SNVs
Pseudo Random Number Generator Seed: 101
Working on chromosome: chr1 . . .
Finished working on chromosome: chr1.
Working on chromosome: chr3 . . .
Finished working on chromosome: chr3.
Working on chromosome: chr2 . . .
Finished working on chromosome: chr2.
Working on chromosome: chr6 . . .
Finished working on chromosome: chr6.
Working on chromosome: chr19 . . .
Finished working on chromosome: chr19.
Working on chromosome: chr17 . . .
Finished working on chromosome: chr17.
Working on chromosome: chr7 . . .
Finished working on chromosome: chr7.
Working on chromosome: chr8 . . .
Chasm context requires a three nucleotide string (Provided: "")
Traceback (most recent call last):
File "/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/console/../../prob2020/python/utils.py", line 131, in wrapper
result = f(*args, **kwds)
File "/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/console/../../prob2020/console/randomization_test.py", line 51, in singleprocess_permutation
sc = SequenceContext(gs, seed=opts['seed'])
File "/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/console/../../prob2020/python/sequence_context.py", line 12, in __init__
self._init_context(gene_seq)
File "/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/console/../../prob2020/python/sequence_context.py", line 100, in _init_context
first_context = prob2020.python.mutation_context.get_chasm_context(first_nucs)
File "/home/mlscl3/2020/probabilistic2020-1.0.7/prob2020/console/../../prob2020/python/mutation_context.py", line 138, in get_chasm_context
'(Provided: "{0}")'.format(tri_nuc))
ValueError: Chasm context requires a three nucleotide string (Provided: "")
Traceback (most recent call last):
File "prob2020/console/probabilistic2020.py", line 266, in
I'm guessing 2020.fa, 2020.bed, and 2020.maf are your own data. Can you run the command on the quick start example data?
Thanks for the help. It ran perfectly with the quick start example data. I changed my data format accordingly, and now it is working fine.
Could you tell me what the data format problem was? I might be able to update the documentation or modify the code to give a more informative error message.
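The thread does not record what the format problem actually was. As one possible pre-flight check, here is a minimal sketch that flags BED12 entries whose coding region is shorter than three bases. An empty gene sequence like that is only an assumption about how the `(Provided: "")` error could be triggered. The helper names `coding_length` and `find_short_genes` are hypothetical and are not part of probabilistic2020.

```python
def coding_length(bed_line):
    """Return the number of coding bases described by one BED12 line.

    Assumes the standard 12 tab-separated BED columns, where columns 7-8
    (thickStart/thickEnd) give the coding interval and columns 11-12 give
    exon sizes and exon starts relative to chromStart.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom_start = int(fields[1])
    thick_start, thick_end = int(fields[6]), int(fields[7])
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    rel_starts = [int(x) for x in fields[11].rstrip(",").split(",")]

    total = 0
    for size, rel in zip(sizes, rel_starts):
        exon_start = chrom_start + rel
        exon_end = exon_start + size
        # Count only the part of the exon that overlaps the coding interval.
        overlap = min(exon_end, thick_end) - max(exon_start, thick_start)
        total += max(0, overlap)
    return total


def find_short_genes(bed_lines, min_len=3):
    """Return (name, coding_length) for entries with fewer than min_len coding bases."""
    flagged = []
    for line in bed_lines:
        # Skip blank lines and common header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        n_coding = coding_length(line)
        if n_coding < min_len:
            flagged.append((line.split("\t")[3], n_coding))
    return flagged


if __name__ == "__main__":
    # Made-up example rows, not taken from the reporter's data.
    example = [
        "chr8\t100\t200\tGENE_OK\t0\t+\t110\t190\t0\t2\t30,40,\t0,60,",
        "chr8\t300\t400\tGENE_EMPTY\t0\t+\t300\t300\t0\t1\t100,\t0,",
    ]
    for name, n in find_short_genes(example):
        print("{0}: only {1} coding bases".format(name, n))
```

Running something like this over the lines of the BED file before calling probabilistic2020 would point at suspicious entries by name. If nothing is flagged, the cause is more likely elsewhere, for example in how the FASTA and BED files correspond.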
When I try to run probabilistic2020 with my data, it runs into the following error:
`Version: 1.0.7
Command: /usr/local/bin/probabilistic2020 oncogene -i 2020.fa -b 2020sort.bed -m 2020.maf -c 1.5 -p 10 -o oncogene_output_2020.txt
Kept 1107 mutations after droping mutations with missing information (Droped: 0)
Dropped 0 mutations after only keeping Missense_Mutation, Silent, Nonsense_Mutation, Splice_Site, Nonstop_Mutation, Translation_Start_Site. Indels are processed separately.
Dropped 0 mutations after only keeping valid SNVs
Pseudo Random Number Generator Seed: 101
Working on chromosome: chr1 . . .
Working on chromosome: chr3 . . .
Working on chromosome: chr2 . . .
Working on chromosome: chr6 . . .
Working on chromosome: chr19 . . .
Working on chromosome: chr17 . . .
Working on chromosome: chr7 . . .
Working on chromosome: chr8 . . .
Working on chromosome: chr9 . . .
Working on chromosome: chr11 . . .
Chasm context requires a three nucleotide string (Provided: "")
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/utils.py", line 131, in wrapper
result = f(*args, **kwds)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/randomization_test.py", line 51, in singleprocess_permutation
sc = SequenceContext(gs, seed=opts['seed'])
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 12, in __init__
self._init_context(gene_seq)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 100, in _init_context
first_context = prob2020.python.mutation_context.get_chasm_context(first_nucs)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/mutation_context.py", line 138, in get_chasm_context
'(Provided: "{0}")'.format(tri_nuc))
ValueError: Chasm context requires a three nucleotide string (Provided: "")
Finished working on chromosome: chr3.
Finished working on chromosome: chr19.
Finished working on chromosome: chr17.
Finished working on chromosome: chr6.
Chasm context requires a three nucleotide string (Provided: "")
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/utils.py", line 131, in wrapper
result = f(*args, **kwds)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/randomization_test.py", line 51, in singleprocess_permutation
sc = SequenceContext(gs, seed=opts['seed'])
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 12, in __init__
self._init_context(gene_seq)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 100, in _init_context
first_context = prob2020.python.mutation_context.get_chasm_context(first_nucs)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/mutation_context.py", line 138, in get_chasm_context
'(Provided: "{0}")'.format(tri_nuc))
ValueError: Chasm context requires a three nucleotide string (Provided: "")
Finished working on chromosome: chr7.
Finished working on chromosome: chr2.
Finished working on chromosome: chr11.
Finished working on chromosome: chr1.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/utils.py", line 131, in wrapper
result = f(*args, **kwds)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/randomization_test.py", line 51, in singleprocess_permutation
sc = SequenceContext(gs, seed=opts['seed'])
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 12, in __init__
self._init_context(gene_seq)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/sequence_context.py", line 100, in _init_context
first_context = prob2020.python.mutation_context.get_chasm_context(first_nucs)
File "/usr/local/lib/python3.4/dist-packages/prob2020/python/mutation_context.py", line 138, in get_chasm_context
'(Provided: "{0}")'.format(tri_nuc))
ValueError: Chasm context requires a three nucleotide string (Provided: "")
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/probabilistic2020", line 11, in
sys.exit(cli_main())
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/probabilistic2020.py", line 262, in cli_main
main(opts)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/probabilistic2020.py", line 210, in main
result_df = rt.main(opts, mutation_df)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/randomization_test.py", line 389, in main
permutation_result = multiprocess_permutation(bed_dict, mut_df, opts)
File "/usr/local/lib/python3.4/dist-packages/prob2020/console/randomization_test.py", line 152, in multiprocess_permutation
for chrom_result in process_results:
File "/usr/lib/python3.4/multiprocessing/pool.py", line 689, in next
raise value
ValueError: Chasm context requires a three nucleotide string (Provided: "")
`