Closed by lydiayliu 2 years ago
My guess is that the first error message is printed by the worker process/thread and the second by the main thread. The binary characters might appear because the data gets messed up somehow when it is pickled/unpickled and transferred between threads.
I opened an issue at uqfoundation/pathos#228
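To make the guess above concrete, here is a minimal in-process sketch of how one failure can end up reported twice: once by the worker that hit it, and once by the main side after the pickled exception is sent back. This is only an illustration under assumptions. It does not use pathos or ppft, the names `worker`, `main_side`, and `failing_task` are made up, and the real transport may well work differently.

```python
import pickle
import sys
import traceback


def worker(task):
    """Hypothetical stand-in for a worker: run the task and return a pickled payload."""
    try:
        return pickle.dumps(("ok", task()))
    except Exception as exc:
        # First report: the worker prints the error itself (stdout here).
        summary = "".join(traceback.format_exception_only(type(exc), exc)).strip()
        print("worker:", summary)
        # The exception object is pickled so it can cross the process boundary.
        return pickle.dumps(("error", exc))


def main_side(payload):
    """Hypothetical stand-in for the main thread: unpickle and report again."""
    status, value = pickle.loads(payload)
    if status == "error":
        # Second report: same exception, different stream (stderr here).
        print("main:", repr(value), file=sys.stderr)
    return status, value


def failing_task():
    # Mimics the lookup from the traceback: the transcript ID is not in the annotation.
    transcripts = {}
    return transcripts["ENST00000607654.1"]


status, value = main_side(worker(failing_task))
```

In this sketch the exception survives the pickle round trip intact, so a clean round trip alone would not explain the binary characters. That part is still just a guess.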
Adding another case here for a=/hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/gsnp/CPCG0249.gencode.tsv.s.gvf
[ 2022-01-13 17:36:55 ] 16000 transcripts processed.
[ 2022-01-13 17:37:13 ] Exception raised from fusion FUSION-ENSG00000118260.15:47919-ENSG00000227308.2:22649
An error has occured during the function execution
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/ppft/__main__.py", line 111, in run
    __result = __f(*__args)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 191, in wrapper
    return call_variant_peptides_wrapper(*dispatch)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 160, in call_variant_peptides_wrapper
    _peptides = call_peptide_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 357, in call_peptide_fusion
    dgraph.create_variant_graph(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 916, in create_variant_graph
    cursors = self.apply_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 555, in apply_fusion
    insertion_variants = variant_pool.filter_variants(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/seqvar/VariantRecordPool.py", line 177, in filter_variants
    gene_id = self.anno.transcripts[tx_id].transcript.gene_id
KeyError: 'ENST00000607654.1'
The same transcript is hit in 4 threads, producing 4 errors, some with the strange symbols in between.
Case one (CPCG0324) also seems to be fixed by #339. Fun fact: for this fusion, the donor part has 1300 bases and the acceptor's exonic sequence has 1710 bases, but the intronic region carried over from the acceptor gene has 91093 bases 😂
Case 2 also fixed!
the intronic region carried over from the acceptor gene has 91093 bases
lmao! it's these introns that are making fusion run time super slow lolll
gimme a sec to double check both of these!
both cases confirmed resolved. wow #339 is the bomb
Please also check out the very end of the log message. I believe the error is reported on two threads, but there are a lot of strange characters (^@^@^@^@^@^@^@^@^@^@^) in the log... Also, with the new multiprocessing, error reporting always happens twice. The above error was written to the LOG file (so stdout), but on the terminal you also get this below (which is stderr). Is this split design intentional?