christopher-vollmers / Mandalorion

Pipeline to identify isoforms from full-length cDNA sequencing data
MIT License

segfault in alignment module #1

Open pdoris opened 9 months ago

pdoris commented 9 months ago

My command is

python3 Mando.py -p /work2/06127/pdoris/frontera/Mandalorion/Mandalorion/ -g UCSC2GRCr8.sorted.gtf -G GRCr8.fasta -f Consensus_reads.fofn -M A


Mandalorion - Isoform identification and quantification
Version - v4.3.0 - That's no moon. It's an isoform


Module A - Alignment           

checking input fasta/q files
combining 18 input fasta/q files

This produces a very big combined FASTA file (about 35.5 GB): 35541118333 Feb 5 16:39 Combined.fasta

sh: line 1: 36889 Segmentation fault /work2/06127/pdoris/frontera/Mandalorion/Mandalorion/minimap2/minimap2 -G 400k --secondary=no -ax splice:hq --cs=long -uf -t 8 GRCr8.fasta /work2/06127/pdoris/frontera/Mandalorion/Mandalorion///tmp//Combined.fasta > /work2/06127/pdoris/frontera/Mandalorion/Mandalorion///tmp//mm2Alignments.sam

christopher-vollmers commented 9 months ago

Hi,

That error suggests that you are running out of RAM running minimap2. Can I ask how much RAM you have on the machine you are using? RAM usage of minimap2 (as I understand it) depends on the genome reference size so it's pretty much fixed and most often >30GB.
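One quick way to check how much RAM a machine has before launching the alignment is sketched below. It assumes a POSIX system where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`. The helper name `total_ram_gb` is hypothetical, and the 30 GB threshold is only the rough figure mentioned above, not a documented minimap2 requirement.

```python
import os


def total_ram_gb():
    """Return installed physical RAM in GB, or None if it cannot be queried."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are not available on this platform.
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1e9


if __name__ == "__main__":
    ram = total_ram_gb()
    if ram is None:
        print("Could not determine installed RAM on this platform.")
    elif ram < 30:  # rough threshold taken from the comment above (assumption)
        print(f"Only {ram:.1f} GB RAM; the minimap2 step may run out of memory.")
    else:
        print(f"{ram:.1f} GB RAM detected.")
```

On a cluster the number that matters is the memory granted to the job, which may be lower than what the node reports.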

pdoris commented 9 months ago

Thanks, I increased the memory and that problem is solved... until I reach the P module.

It looks like a spawn error. This is on macOS 14.2 with Python 3.9.18; I also tried pyenv 3.8.18, with the same result.


Mandalorion - Isoform identification and quantification
Version - v4.3.0 - That's no moon. It's an isoform


Module P - .sam to .clean.sorted.psl conversion           

converting sam output to psl format

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/pdoris/Mandalorion/emtrey.py", line 202, in <module>
    main()
  File "/Users/pdoris/Mandalorion/emtrey.py", line 199, in main
    delegatingPslConversion(inFile,batch)
  File "/Users/pdoris/Mandalorion/emtrey.py", line 185, in delegatingPslConversion
    processSamBatch(reads,chromosomes)
  File "/Users/pdoris/Mandalorion/emtrey.py", line 156, in processSamBatch
    pool = mp.Pool(processes=threads)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/pdoris/.pyenv/versions/3.8.18/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

christopher-vollmers commented 9 months ago

That's an error I haven't seen before... What operating system are you using to run this?

It's definitely a multiprocessing error in emtrey.py and I'm researching the error now.

I'm asking because a lot of pages describe this error as a Windows thing, and I have much less experience with Windows...

I'll see whether I can reproduce and fix this somehow. I have some ideas.
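The idiom named in the RuntimeError text can be sketched as follows. Under the "spawn" start method each worker re-imports the main script, so any module-level code that creates a pool runs again inside the child and triggers this error. The traceback suggests `main()` in emtrey.py is called at module level, but that is an inference from the log, not a reading of the source. `square` and `main` below are hypothetical stand-ins, not Mandalorion functions.

```python
import multiprocessing as mp


def square(x):
    """Worker function; defined at module level so child processes can import it."""
    return x * x


def main():
    # The pool is only created when main() is called, never at import time.
    with mp.Pool(processes=2) as pool:
        return pool.map(square, range(4))


if __name__ == "__main__":
    # A spawned child imports this file under a different module name,
    # so this block is skipped there and no nested pool is started.
    print(main())
```

With the guard in place the same script should behave the same way under both "fork" and "spawn".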

pdoris commented 9 months ago

Mac OSX 14.2

This might be helpful to you:

https://stackoverflow.com/questions/60691363/runtimeerrorfreeze-support-on-mac

Thanks,

I have finished some smaller analysis on a Linux server, but that is currently down and my big job was in the queue for three days! So this is a mac problem…and maybe windows too.

Peter


christopher-vollmers commented 9 months ago

It's a spawn vs fork issue for sure. We aren't testing this on MacOS so there might be other spots this will break as well... My guess is this will run fine on a Linux machine.

I'll need a few days to figure out how to force fork over spawn in macOS and then find a Mac to test this on.

But there is a lot more multiprocessing happening later in Mandalorion, and all of that will probably break as well.

A quick thing to try is to make python3.7 your default python3 and try to run Mandalorion that way for now.
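For context on why Python 3.7 helps: Python 3.8 changed the default multiprocessing start method on macOS from "fork" to "spawn", so 3.7 still forks by default. One possible way to force fork on a newer Python, without touching the global default, is to request a fork context explicitly. This is a minimal sketch and not the fix adopted in Mandalorion; `double` and `run_with_fork` are hypothetical names, and "fork" is only available on POSIX systems.

```python
import multiprocessing as mp


def double(x):
    """Worker function used only for this illustration."""
    return 2 * x


def run_with_fork(values, processes=2):
    # get_context("fork") returns an object with the same API as the
    # multiprocessing module, but bound to the fork start method.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(double, values)


if __name__ == "__main__":
    # Shows the platform default ("spawn" on macOS since Python 3.8).
    print("default start method:", mp.get_start_method())
    print(run_with_fork([1, 2, 3]))
```

The Python documentation notes that fork can be unsafe on macOS when system frameworks are in use, which is why the default was changed, so this is a workaround to test rather than a guaranteed fix.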

pdoris commented 9 months ago

Doing this now and will let you know


pdoris commented 9 months ago

It is running OK under py 3.7.17

@.*** Mandalorion % python3 Mando.py -p results -g UCSC2GRCr8.sorted.gtf -G GRCr8.fasta -f Consensus_reads.fofn -M P


Mandalorion - Isoform identification and quantification

Version - v4.3.0 - That's no moon. It's an isoform



Module P - .sam to .clean.sorted.psl conversion

  converting sam output to psl format

        processing read 300000
