Closed: Ulises-Rosas closed this issue 4 years ago
Yes, the global logging variable is one per process by design. That in itself is unlikely to be the issue. It is almost certainly a bug I introduced recently when cutting down on the log chatter. I'll look into it asap.
FYI: Logging across processes is a mess because Python streams can be blocking, which causes hangs. So, subprocesses log into a temp file, which is then appended to the main log file upon completion.
https://github.com/juliema/aTRAM/blob/master/lib/log.py#L71-L93
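The pattern is roughly the following (a simplified sketch, not the actual code behind that link; the function and argument names here are made up for illustration):

```python
import os
import tempfile


def run_in_subprocess(main_log_path, work):
    """Run `work`, sending its log lines to a private temp file."""
    fd, temp_path = tempfile.mkstemp(suffix='.log')
    try:
        with os.fdopen(fd, 'w') as temp_log:
            work(temp_log)  # the subprocess writes only to its own file
    finally:
        # Append the subprocess's log to the main log after it finishes, so
        # two processes never write to the same stream at the same time.
        with open(temp_path) as temp_log, open(main_log_path, 'a') as main_log:
            main_log.write(temp_log.read())
        os.remove(temp_path)
```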
I am unable to replicate the error. Would you please send me the exact command you ran to get this error? I don't need the data, just the command.
Pretty much something like this:
atram_preprocessor.py --mixed-ends [fastq file]
where [fastq file] is the input for the --mixed-ends argument. I am also using a conda environment and makeblastdb version 2.9.0+.
I also had to add __spec__ = None at the __main__ conditional of atram_preprocessor.py in order to avoid another issue. Without it the LOGGER issues persist.
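The change was roughly of this shape (a sketch only; run() below is a placeholder for whatever atram_preprocessor.py's __main__ block actually calls):

```python
#!/usr/bin/env python3
# Sketch of where the workaround goes; run() is a stand-in for the real
# argument parsing and preprocessing calls in atram_preprocessor.py.


def run():
    print('preprocessing would start here')


if __name__ == '__main__':
    __spec__ = None  # reported workaround for a multiprocessing import problem
    run()
```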
I'm still unable to replicate the issue. I've tried various scenarios, including adding __spec__ = None to the __main__ of atram_preprocessor.py. The sequence always looks similar too...
(base) ~/work/aTRAM$ conda activate aTRAM
(aTRAM) ~/work/aTRAM$ ./atram_preprocessor.py --mixed-ends input/ptgor_small_1.fasta.gz --gzip --path ~/bin/ncbi-blast-2.10.0+/bin/
2020-04-10 11:46:58 INFO: ################################################################################
2020-04-10 11:46:58 INFO: aTRAM version: v2.3.0
2020-04-10 11:46:58 INFO: Python version: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0]
2020-04-10 11:46:58 INFO: ./atram_preprocessor.py --mixed-ends input/ptgor_small_1.fasta.gz --gzip --path ~/bin/ncbi-blast-2.10.0+/bin/
2020-04-10 11:46:58 INFO: Loading "input/ptgor_small_1.fasta.gz" into sqlite database
2020-04-10 11:47:55 INFO: Creating an index for the sequence table
2020-04-10 11:48:00 INFO: Assigning sequences to shards
2020-04-10 11:48:01 INFO: Making blast DBs
2020-04-10 11:48:50 INFO: Finished making blast all 9 DBs
Then this issue is coming from my local configuration rather than the aTRAM code itself. I'm not sure how to replicate it outside my local configuration. I would really appreciate it if you could review the code anyway.
I could fix my issue by checking the existence of the LOGGER variable (i.e. that it is not None) in the subcommand function of log.py.
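Roughly, the guard looks like this (a simplified sketch; the real subcommand() in lib/log.py does more, e.g. the temp-file handling mentioned above):

```python
import logging
import subprocess

LOGGER = None  # module-level global; only the parent process sets it up


def setup(log_path):
    """Parent-side setup that assigns the global logger."""
    global LOGGER
    LOGGER = logging.getLogger('atram')
    LOGGER.addHandler(logging.FileHandler(log_path))
    LOGGER.setLevel(logging.INFO)


def subcommand(cmd, timeout=None):
    """Run a shell command, but only log it when LOGGER was initialized."""
    if LOGGER is not None:
        LOGGER.info('Running: %s', cmd)
    subprocess.run(cmd, shell=True, check=True, timeout=timeout)
```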
Thank you!
Yes, I can review your code. I'm guessing it's in your repo's master branch?
Yes, it is, but it's fine if you don't review it! You already did a lot by demonstrating that this issue is basically a problem with my local configuration.
When I ran a line like this:
I got the following error:
Upon looking at the code, I noticed that the variable LOGGER is used as a global to generate log files. However, multiprocessing creates completely new processes without inheriting changes to this global variable. So I think the way this variable is passed into a parallel run needs further review. One solution would be to simply block log file updates when atram_preprocessor.py is running in parallel, but I do not think this is a good solution.
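For reference, a tiny standalone script (generic Python, not aTRAM code) shows this behaviour when the multiprocessing start method is 'spawn', which is the default on Windows and on macOS from Python 3.8 onward:

```python
import multiprocessing as mp

LOGGER = None  # stands in for the module-level logger global


def worker(_):
    # With the 'spawn' start method the child re-imports this module, so it
    # sees the initial value of LOGGER, not the value set by the parent below.
    return LOGGER is None


if __name__ == '__main__':
    LOGGER = 'configured in the parent process'
    ctx = mp.get_context('spawn')
    with ctx.Pool(2) as pool:
        print(pool.map(worker, range(2)))  # prints [True, True] under spawn
```

Under the 'fork' start method (the Linux default) the workers inherit the parent's value, which may explain why the problem is hard to reproduce on some systems.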