Closed AndyWangSFU closed 4 years ago
Hi Andy,
Could you try updating to the latest version from the master branch (including the sub-repository for bscall)? Then please remove any existing .gemBS directory in your working directory and re-run the gemBS prepare and gemBS index steps. Let me know if this does not resolve the problem.
Thanks, Simon
p.s., the documentation is not completely up to date with respect to the various index files that are generated. This is something that is on the todo list!
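The clean-up described above can be sketched in Python as follows. This is only an illustration: the helper name `reset_gembs_state` is hypothetical, the file names example.conf and example.csv are the ones used elsewhere in this thread, and the commands are returned for review rather than executed.

```python
import shutil
import tempfile
from pathlib import Path


def reset_gembs_state(workdir, config="example.conf", metadata="example.csv"):
    """Remove an existing .gemBS directory and return the commands to re-run.

    The commands are returned, not executed, so they can be checked first.
    """
    state_dir = Path(workdir) / ".gemBS"
    removed = state_dir.is_dir()
    if removed:
        shutil.rmtree(state_dir)  # remove the existing .gemBS directory
    commands = [
        ["gemBS", "prepare", "-c", config, "-t", metadata],
        ["gemBS", "index"],
    ]
    return removed, commands


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not on a real project.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ".gemBS").mkdir()
        removed, commands = reset_gembs_state(tmp)
        print("removed .gemBS:", removed)
        for cmd in commands:
            print("next:", " ".join(cmd))
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)` from inside the working directory once the list looks right.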
On Wed, Mar 11, 2020 at 1:23 AM Andy Wang notifications@github.com wrote:
Hi GemBS team,
Thanks for designing gemBS, this cool methylation calling tool. As it claims to perform better than bis-snp, our lab wants to run SNP calling to see how many SNPs we can get (right now, just testing on one sample). We have compute nodes with gemBS versions 3.0.0 and 3.5.0.
I tried both versions of gemBS and both worked well up to and including the index step. However, at the next step, "gemBS call" (I already have the .bam files, so I want to skip the "map" step), they return different errors:
--- For gemBS 3.5.0, it returns:
$ gemBS call
gemBS reference indexes/Homo_sapiens_assembly38.gemBS.ref not found. Run 'gemBS index' or correct configuration file and rerun
This confuses me a little. According to the gemBS manual, the index step does not create a file with a .gemBS.ref extension. Anyhow, I then tried to run "gemBS index" again and got this:
$ gemBS index
2020-03-10 16:29:16,026 ERROR: Process '/usr/lib/python3.6/site-packages/gemBS/bin/samtools' finished with 1
2020-03-10 16:29:16,026 ERROR: [faidx] Could not build fai index indexes/Homo_sapiens_assembly38.gemBS.ref.fai
ValueError: Error while making faidx index of gemBS reference
I am not sure what happened, but I guess something went wrong when I updated the cluster node, causing gemBS to fail to load related packages/dependencies. This needs to be double-checked. If you know the reason, I would really appreciate your help.
--- For gemBS 3.0.0, it returns:
$ gemBS -j test.json call
Traceback (most recent call last):
  File "/usr/bin/gemBS", line 13, in <module>
    load_entry_point('gemBS==3.0.0', 'console_scripts', 'gemBS')()
  File "/usr/lib/python3.4/site-packages/gemBS/commands.py", line 157, in gemBS_main
    instances[args.command].run(args)
  File "/usr/lib/python3.4/site-packages/gemBS/production.py", line 906, in run
    if self.conversion != None and self.conversion.lower() == "auto" and not args.concat:
AttributeError: 'list' object has no attribute 'lower'
But sometimes it returns a different error, like:
Traceback (most recent call last):
  File "/usr/bin/gemBS", line 13, in <module>
    load_entry_point('gemBS==3.0.0', 'console_scripts', 'gemBS')()
  File "/usr/lib/python3.4/site-packages/gemBS/commands.py", line 157, in gemBS_main
    instances[args.command].run(args)
  File "/usr/lib/python3.4/site-packages/gemBS/production.py", line 112, in run
    prepareConfiguration(text_metadata=args.text_metadata,configFile=args.config,no_db=args.no_db,dbfile=args.dbfile,output=args.output)
  File "/usr/lib/python3.4/site-packages/gemBS/__init__.py", line 455, in prepareConfiguration
    db.check()
  File "/usr/lib/python3.4/site-packages/gemBS/database.py", line 115, in check
    self.check_contigs(sync)
  File "/usr/lib/python3.4/site-packages/gemBS/database.py", line 289, in check_contigs
    for fname, pool, smp, ftype, status in c.execute("SELECT * FROM calling"):
ValueError: too many values to unpack (expected 5)
These errors are quite confusing. Do they occur only in older versions, and have they been fixed now?
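For what it is worth, the second traceback has a shape that can be reproduced outside gemBS: a `SELECT *` that returns more columns than the loop unpacks. The sketch below uses an in-memory SQLite table with a hypothetical sixth column named `extra`; the real gemBS schema is not shown in this thread, and whether a schema mismatch between versions is the actual cause here is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical six-column table; the real gemBS schema may differ.
conn.execute("CREATE TABLE calling (fname, pool, smp, ftype, status, extra)")
conn.execute("INSERT INTO calling VALUES ('a.bcf', 'p1', 's1', 1, 0, 'x')")

try:
    # Five names on the left, but SELECT * yields six values per row.
    for fname, pool, smp, ftype, status in conn.execute("SELECT * FROM calling"):
        pass
    error = None
except ValueError as exc:
    error = str(exc)

print(error)  # a "too many values to unpack" message, as in the traceback
conn.close()
```

Naming the columns explicitly in the query (for example `SELECT fname, pool, smp, ftype, status FROM calling`) keeps such a loop working even if a table later gains columns.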
For the file structure, it is very simple, like this:
- reference: Homo_sapiens_assembly38.fa
- indexes (5 files generated by the "gemBS index" step): Homo_sapiens_assembly38.contig.sizes / dbSNP_gemBS.idx / gem_indexer_Homo_sapiens_assembly38.BS.gem.err / Homo_sapiens_assembly38.BS.gem / Homo_sapiens_assembly38.BS.info
- mapping: IX6859.bam
- dbsnp (downloaded from the NCBI FTP server): *.bed.gz
- example.csv
- example.conf
I have also attached the configuration files (example.csv and example.conf) I used. I think this is a cool tool, and I may well have made a small, silly mistake somewhere. If you could spare a little time to have a quick look, I would be very grateful.
Thanks, Andy Wang
Research Assistant UBC Heart Lung Innovation | Daley Lab Room 166, 1081 Burrard Street, St. Paul's Hospital, Vancouver, B.C. Canada V6Z 1Y6
test.zip https://github.com/heathsc/gemBS/files/4315344/test.zip
Hi Simon,
Many thanks for your quick reply. Some cluster nodes already have the latest version of gemBS (3.5.0). I will test for and fix any package linking errors (samtools, bgzip, etc.) as soon as possible, then rerun the whole SNP calling procedure.
For now, do you see any obvious errors in my configuration files?
Thanks again, Andy
Hi Simon,
Some updates:
I updated gemBS to the latest version and removed two .gemBS files (Homo_sapiens_assembly38.gemBS.ref and Homo_sapiens_assembly38.gemBS.contig_md5).
Then I ran the "gemBS prepare" command:
$ gemBS prepare -c example.conf -t example.csv
:
gemBS_Reference file 'indexes/Homo_sapiens_assembly38.gemBS.ref': Missing
:
:
To generate missing files run gemBS index
Since the missing .gemBS.ref file is the one I had just deleted, everything looks fine up to here. Then I ran the "gemBS index" command to try to rebuild this file:
$ gemBS --loglevel debug index
2020-03-12 17:30:37,677 DEBUG: Using bundled binary : /usr/lib/python3.6/site-packages/gemBS/gemBSbinaries/md5_fasta
2020-03-12 17:30:37,679 INFO: Starting: /usr/lib/python3.6/site-packages/gemBS/gemBSbinaries/md5_fasta -o indexes/Homo_sapiens_assembly38.gemBS.contig_md5 -s reference/Homo_sapiens_assembly38.fa | bgzip -@ 8
2020-03-12 17:30:37,679 DEBUG: Starting subprocess
2020-03-12 17:30:37,682 DEBUG: File output detected, opening output stream to indexes/Homo_sapiens_assembly38.gemBS.ref
2020-03-12 17:30:37,684 DEBUG: Setting process input to parent output
2020-03-12 17:30:37,684 DEBUG: Starting subprocess
Traceback (most recent call last):
  File "/usr/bin/gemBS", line 13, in <module>
    load_entry_point('gemBS==3.5.0', 'console_scripts', 'gemBS')()
  File "/usr/lib/python3.6/site-packages/gemBS/commands.py", line 156, in gemBS_main
    instances[args.command].run(args)
  File "/usr/lib/python3.6/site-packages/gemBS/production.py", line 181, in run
    ret = mk_gembs_reference(fasta_input, greference, contig_md5, extra_fasta_files=extra_fasta_files, threads=self.threads, populate_cache=populate_cache)
  File "/usr/lib/python3.6/site-packages/gemBS/__init__.py", line 553, in mk_gembs_reference
    process = run_tools([md5_fasta,bgzip_command], name='md5_fasta', output = greference)
  File "/usr/lib/python3.6/site-packages/gemBS/utils.py", line 330, in run_tools
    p.start()
  File "/usr/lib/python3.6/site-packages/gemBS/utils.py", line 239, in start
    p.run()
  File "/usr/lib/python3.6/site-packages/gemBS/utils.py", line 120, in run
    self.process = subprocess.Popen(self.commands, stdin=stdin, stdout=stdout, stderr=stderr, env=self.env, close_fds=False)
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'bgzip': 'bgzip'
It looks like gemBS is trying to use the .fa file to generate the .gemBS.ref file but fails because of a problem with bgzip. Do you think that is the case? I am not quite sure, because this error did not occur in gemBS 3.0.0. I will try to install the bgzip package, rerun, and see. @heathsc
Thanks a lot for your attention.
Best, Andy
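The last line of that traceback is what Python's `subprocess.Popen` raises when the executable it is asked to start cannot be found. A minimal sketch reproducing the same error class with a deliberately nonexistent tool name (the name `bgzip-not-installed-example` and the helper `try_start` are made up for illustration, and this is not gemBS code):

```python
import shutil
import subprocess


def try_start(command):
    """Start a command; report a missing executable instead of raising."""
    try:
        proc = subprocess.Popen(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError as exc:
        # exc.filename holds the program name that could not be found.
        return "missing: {} (errno {})".format(exc.filename, exc.errno)
    proc.wait()
    return "started"


tool = "bgzip-not-installed-example"  # hypothetical name, assumed absent
print(shutil.which(tool))             # None when the tool is not on PATH
print(try_start([tool, "-@", "8"]))
```

Calling `shutil.which("bgzip")` on the affected node is a quick way to see whether a bare `bgzip` would be resolvable at all.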
Updates: Thanks Simon. The gemBS call is now working. It takes a while to complete. I will see the results tomorrow morning and let you know if any error occurs.
This is odd - bgzip should be automatically installed as part of the package, so something has gone wrong with the installation. Is the bgzip executable in /usr/lib/python3.6/site-packages/gemBS/bin/ ?
Simon
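The check asked for above can also be scripted. The sketch below looks for an executable first in a given directory and then on PATH; the helper name `locate_tool` and the lookup order are assumptions for illustration and do not describe how gemBS itself resolves its tools. The directory is the one named in the question above.

```python
import os
import shutil


def locate_tool(name, bundled_dir):
    """Return a path to `name`, trying bundled_dir first, then PATH.

    Returns None when the tool is found in neither place.
    """
    candidate = os.path.join(bundled_dir, name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return shutil.which(name)


# Directory taken from the question above; adjust for other installations.
BUNDLED_DIR = "/usr/lib/python3.6/site-packages/gemBS/bin"

if __name__ == "__main__":
    for tool in ("bgzip", "samtools"):
        print(tool, "->", locate_tool(tool, BUNDLED_DIR))
```

A result of `None` for bgzip would be consistent with the FileNotFoundError reported earlier in the thread.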
Yeah. I just added the bgzip library to my working directory and it worked. So far I have successfully reached the gemBS "merge" step. Thank you very much, Simon!
Andy