AmpliconSuite / AmpliconSuite-pipeline

A quickstart tool for AmpliconArchitect. Performs all preliminary steps (alignment, CNV calling, seed interval detection) required prior to running AmpliconArchitect. Previously called PrepareAA.

TypeError: 'NoneType' object is not callable #16

Closed zhangheng43 closed 2 years ago

zhangheng43 commented 3 years ago

Dear Jens,

I'm trying to use PrepareAA and AA to analyze our WGS data, starting from .fastq files, but I got an error.

I ran PrepareAA/PrepareAA.py -s PGCC1 -t 40 --canvas_dir ~/Canvas-1.40.0.1613+master_x64 --fastqs /media/baiqing/D215server/Data/80-614826108-WGS/N2102812_ZH_80-614826108_MEDSEQ/210326-NovaA/PGCC-DNA1-LDC4348_combined_R1.fastq.gz /media/baiqing/D215server/Data/80-614826108-WGS/N2102812_ZH_80-614826108_MEDSEQ/210326-NovaA/PGCC-DNA1-LDC4348_combined_R2.fastq.gz -o /media/baiqing/D215server/Data/80-614826108-WGS/

and got

Running freebayes
Exception in thread Thread-2:
Traceback (most recent call last):
Exception in thread Thread-4:
  File "/home/baiqing/enter/lib/python3.8/threading.py", line 932, in _bootstrap_inner
  File "PrepareAA/PrepareAA.py", line 22, in run
    self.run()
    self._target(*self._args)
TypeError: 'NoneType' object is not callablee
Merging VCFs and zipping
Traceback (most recent call last):
  File "PrepareAA/PrepareAA.py", line 473, in <module>
    merged_vcf_file = merge_and_filter_vcfs(chr_sizes.keys(), vcf_files, outdir, sname)
  File "PrepareAA/PrepareAA.py", line 171, in merge_and_filter_vcfs
    sorted_chr_names = ["chr" + str(x) for x in sorted(numeric_chr_names)]
TypeError: '<' not supported between instances of 'str' and 'int'

Could you kindly help me to solve the problem? Thank you!

Best, Heng Zhang

jluebeck commented 3 years ago

Hi Heng,

Thank you for reporting this bug. I have pushed a bugfix for this in 1820385, so please let me know if any issues persist.

Best regards, Jens
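
For readers who hit the same second traceback: in Python 3, `sorted()` raises this `TypeError` when a list mixes `int` and `str` values, which can happen when numeric chromosome names are converted to integers but names such as X, Y or M are left as strings. The sketch below is not the change made in 1820385 (its contents are not shown in this thread); it only illustrates one common way to sort chromosome names safely. The function name `chrom_sort_key` and the sample list are assumptions made for illustration.

```python
def chrom_sort_key(name):
    """Sort key: numeric chromosomes first (in numeric order), then the rest alphabetically."""
    # Strip an optional "chr" prefix so "chr2" and "2" are treated alike.
    base = name[3:] if name.startswith("chr") else name
    if base.isdigit():
        return (0, int(base), "")
    return (1, 0, base)


# Hypothetical input mixing numeric and non-numeric chromosome names.
names = ["chr10", "chr2", "chrX", "chr1", "chrM", "chrY"]
sorted_names = sorted(names, key=chrom_sort_key)
print(sorted_names)
```

Because every key is a tuple of the same shape, Python never has to compare an `int` with a `str`, so the sort cannot fail the way the traceback above does.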

zhangheng43 commented 3 years ago

Dear Jens,

Thank you for your timely reply and bugfix. I updated my PrepareAA at once and reran the previous command, and got

Running freebayes
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/baiqing/enter/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Exception in thread Thread-16:
    self.run()
  File "PrepareAA/PrepareAA.py", line 22, in run
    self._target(*self._args)
    self.run()
TypeError: 'NoneType' object is not callable
Merging VCFs and zipping
Traceback (most recent call last):
  File "PrepareAA/PrepareAA.py", line 502, in <module>
    merged_vcf_file = merge_and_filter_vcfs(chr_sizes.keys(), vcf_files, outdir, sname)
  File "PrepareAA/PrepareAA.py", line 172, in merge_and_filter_vcfs
    call("zcat " + chrom_vcf_d["chrM"] + " | awk '$4 != \"N\"' > " + merged_vcf_file, shell=True)
KeyError: 'chrM'

I wonder if there are problems in "threading.py". Could you help me?

Yours sincerely, Heng

jluebeck commented 3 years ago

Hi Heng,

Thanks again for pointing this out. It's an issue with how chrM (or "MT" for GRCh37) is handled in cases where the AA data repo is used for alignment. I've added a fix in 1445a97 which should resolve the issue. Sorry for the inconvenience and please do let me know if issues persist.

Best regards, Jens
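
As an illustration of the naming difference Jens describes (chrM versus MT for GRCh37), here is a minimal sketch of a lookup that tolerates either name and reports a missing mitochondrial entry instead of raising `KeyError`. It is not the code from 1445a97; the helper name `find_mito_key` and the example dictionaries and file names are hypothetical.

```python
def find_mito_key(chrom_vcf_d):
    """Return the key used for the mitochondrial contig, or None if it is absent."""
    for candidate in ("chrM", "MT"):
        if candidate in chrom_vcf_d:
            return candidate
    return None


# Hypothetical per-chromosome VCF maps for three situations.
grch38_like = {"chr1": "chr1.vcf.gz", "chrM": "chrM.vcf.gz"}
grch37_like = {"1": "1.vcf.gz", "MT": "MT.vcf.gz"}
no_mito = {"chr1": "chr1.vcf.gz"}

for label, mapping in (("GRCh38-style", grch38_like),
                       ("GRCh37-style", grch37_like),
                       ("no mitochondrial VCF", no_mito)):
    print("{}: {}".format(label, find_mito_key(mapping)))
```

A caller can then skip the mitochondrial step, or stop with a clear message, when the function returns `None`.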

zhangheng43 commented 3 years ago

Dear Jens,

Thank you for your reply and bugfix! I tried again and got

Running freebayes
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/baiqing/enter/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "PrepareAA/PrepareAA.py", line 22, in run
    self._target(*self._args)
TypeError: 'NoneType' object is not callable
Merging VCFs and zipping
Traceback (most recent call last):
  File "PrepareAA/PrepareAA.py", line 504, in <module>
    merged_vcf_file = merge_and_filter_vcfs(chr_sizes.keys(), vcf_files, outdir, sname)
  File "PrepareAA/PrepareAA.py", line 184, in merge_and_filter_vcfs
    call("zcat " + chrom_vcf_d[i + "p"] + " | grep -v \"^#\" | awk '$4 != \"N\"' >> " + merged_vcf_file, shell=True)
KeyError: 'chr1p'

Could you help to resolve the issue?

Yours sincerely, Heng

jluebeck commented 3 years ago

Hi Heng,

Can you confirm that Freebayes is installed (and can be called from the command line without specifying the path), and can you confirm that PrepareAA then generated a collection of VCF files? A screenshot of the files PrepareAA generated (especially the VCFs) would be helpful as I assess what happened.

Thank you again for reporting these issues, it's very helpful. Jens

zhangheng43 commented 3 years ago

Dear Jens,

I confirm that freebayes can be called from the command line. (Here is a screenshot)

[Screenshot 2021-07-21 21-17-30]

However, it seemed that PrepareAA did not generate VCF files. Here is the screenshot of the folder PrepareAA generated.

[Screenshot 2021-07-21 21-23-21]

Thanks again for your patient and meticulous help! I am looking forward to your assessment.

Best regards, Heng

jluebeck commented 3 years ago

Hi Heng, thank you for providing these, it's very helpful. I am wondering if you would be able to provide a listing of the contents of the "freebayes_vcfs" and "canvas_output" directories in your screenshot above?

I will separately try to recreate this issue on my end to get it fixed sooner.

Thank you! Jens
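
When a screenshot is inconvenient, the same information can be gathered as text. The following is a small sketch, assuming the VCFs would sit directly inside the output directory with a `.vcf` or `.vcf.gz` extension (the exact file naming used by PrepareAA is not stated in this thread); `list_vcfs` is a hypothetical helper name.

```python
import glob
import os


def list_vcfs(directory):
    """Return sorted paths of .vcf and .vcf.gz files directly inside `directory`."""
    found = []
    for pattern in ("*.vcf", "*.vcf.gz"):
        found.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(found)


# Example usage (the directory name comes from the thread; adjust the path as needed):
# for path in list_vcfs("freebayes_vcfs"):
#     print(path)
```

An empty list means the variant-calling step produced no output, which points to the freebayes step itself rather than the later merge.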

prashanthid commented 3 years ago

Hello Jens,

I am stuck at the same step with the same error. PrepareAA did not generate any VCF files in the freebayes_vcfs folder.

Thank you for your help! Prashanthi

zhangheng43 commented 3 years ago

Dear Jens,

The two folders "freebayes_vcfs" and "canvas_output" are empty. Here are the screenshots.

Screenshot from 2021-07-22 10-54-41 Screenshot from 2021-07-22 10-55-02

I hope they help. Thank you!

Yours, Heng

jluebeck commented 3 years ago

Thank you @zhangheng43 and @prashanthid,

I am not able to recreate this issue on my end, which is strange. However, I think it may be related to where freebayes is installed. If freebayes is made available through the .bashrc script or similar (for example, by aliasing the command), it may not be on the system path the way an executable such as /usr/local/bin/freebayes would be, so when Python calls freebayes it won't be found. In 9550c07 I have added an optional argument, --freebayes_dir, that lets users specify a custom path to freebayes. If you provide the path to the directory where the freebayes executable resides, the issue you are both encountering should be resolved.

I personally recommend users use CNVKit for seeding PrepareAA instead of Canvas. CNVKit performs better on low coverage BAM files (that is, it won't crash), and Canvas caps the maximum CN it will report at CN=10, which may be somewhat misleading for downstream analysis when the true CN exceeds that. Feel free to try CNVKit instead.

Thanks, Jens
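
To check the situation Jens describes: shell aliases defined in .bashrc are not visible to programs launched from Python, which look executables up on PATH. The sketch below uses the standard-library `shutil.which` (Python 3.3+) to test that lookup. The helper `resolve_executable` and its `custom_dir` parameter are hypothetical; they mirror the idea behind --freebayes_dir but are not PrepareAA's implementation.

```python
import os
import shutil


def resolve_executable(name, custom_dir=None):
    """Return the full path to `name`, checking custom_dir first and then PATH; None if not found."""
    if custom_dir:
        candidate = os.path.join(custom_dir, name)
        # Accept the candidate only if it is a regular file marked executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # shutil.which searches PATH only; aliases and shell functions are not considered.
    return shutil.which(name)


path = resolve_executable("freebayes")
if path is None:
    print("freebayes not found on PATH; pass the directory that contains it")
else:
    print("freebayes resolves to " + path)
```

If this prints "not found" while typing `freebayes` in a terminal works, the command is most likely provided by an alias or a shell function rather than by a directory on PATH.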

prashanthid commented 3 years ago

Hi Jens,

Thank you for your suggestion. Seeding using CNVKit works perfectly fine!

Regards, Prashanthi

zhangheng43 commented 3 years ago

Dear Jens,

Thanks for your helpful suggestion. I tried to use CNVKit for seeding PrepareAA by running

PrepareAA/PrepareAA.py -s PGCC1 -t 40 --cnvkit_dir cnvkit/cnvkit.py --sorted_bam /media/heng/D215server/Data/ZH_80-614826108_WGS/Out/PGCC1.cs.rmdup.bam -o /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut --ref "GRCh38"

and got

2021-08-04 14:33:55.849314
Running PrepareAA on sample: PGCC1
Running CNVKit batch
python3 cnvkit/cnvkit.py batch -m wgs -r /home/heng/ZhangHeng/data_repo/GRCh38/GRCh38_cnvkit_filtered_ref.cnn -p 40 -d /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/ /media/heng/D215server/Data/ZH_80-614826108_WGS/Out/PGCC1.cs.rmdup.bam
CNVkit 0.9.10.dev0
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/GRCh38_cnvkit_filtered_ref.target-tmp.bed with 568558 regions
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/GRCh38_cnvkit_filtered_ref.antitarget-tmp.bed with 0 regions
Running 1 samples in 40 processes (that's 40 processes per bam)
Running the CNVkit pipeline on /media/heng/D215server/Data/ZH_80-614826108_WGS/Out/PGCC1.cs.rmdup.bam ...
Processing reads in PGCC1.cs.rmdup.bam
Time: 621.456 seconds (1004758 reads/sec, 915 bins/sec)
Summary: #bins=568558, #reads=624412908, mean=1098.2396, min=0.0, max=24657.446666666667
Percent reads in regions: 88.564 (of 705038410 mapped)
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.targetcoverage.cnn with 568558 regions
Skip processing PGCC1.cs.rmdup.bam with empty regions file /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/GRCh38_cnvkit_filtered_ref.antitarget-tmp.bed
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.antitargetcoverage.cnn with 0 regions
Processing target: PGCC1.cs.rmdup
Keeping 566505 of 568558 bins
Correcting for GC bias...
Processing antitarget: PGCC1.cs.rmdup
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cnr with 566505 regions
Segmenting /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cnr ...
Segmenting with method 'cbs', significance threshold 1e-06, in 40 processes
Smoothing overshot at 1 / 1974 indices: (-18.15890346023935, 0.9082211799254036) vs. original (-24.268131488562844, 0.8635337735507143)
Post-processing /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cns ...
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cns with 1227 regions
Applying filter 'ci'
Filtered by 'ci' from 1227 to 457 rows
Calling copy number with thresholds: -1.1 => 0, -0.25 => 1, 0.2 => 2, 0.7 => 3
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.call.cns with 457 regions
Significant hits in 4217/566505 bins (0.744%)
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.bintest.cns with 4217 regions

Running CNVKIt segment
python3 cnvkit/cnvkit.py segment /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cnr -p 40 -o /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cns
Segmenting with method 'cbs', significance threshold 0.0001, in 40 processes
Smoothing overshot at 1 / 1974 indices: (-18.158880744473954, 0.908221429071056) vs. original (-24.2681, 0.863534)
Wrote /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup.cns with 1340 regions

Running amplified_intervals
python2 /home/heng/ZhangHeng/AmpliconArchitect/src/amplified_intervals.py --ref GRCh38 --bed /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup_CNV_GAIN.bed --bam /media/heng/D215server/Data/ZH_80-614826108_WGS/Out/PGCC1.cs.rmdup.bam --gain 4.5 --cnsize_min 50000 --out /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/PGCC1_AA_CNV_SEEDS
Global ref name is GRCh38
Traceback (most recent call last):
  File "/home/heng/ZhangHeng/AmpliconArchitect/src/amplified_intervals.py", line 94, in <module>
    import bam_to_breakpoint as b2b
  File "/home/heng/ZhangHeng/AmpliconArchitect/src/bam_to_breakpoint.py", line 35, in <module>
    import matplotlib
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/__init__.py", line 133, in <module>
    from matplotlib.rcsetup import defaultParams, validate_backend, cycler
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/rcsetup.py", line 31, in <module>
    from matplotlib.fontconfig_pattern import parse_fontconfig_pattern
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/fontconfig_pattern.py", line 28, in <module>
    from backports.functools_lru_cache import lru_cache
ImportError: No module named functools_lru_cache
Completed

It seemed very close to success. I read "readme.md" again and tried to install the missing package by running sudo apt-get install build-essential python-dev gfortran python-numpy python-scipy python-matplotlib python-pip zlib1g-dev samtools and got

Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'python-dev-is-python2' instead of 'python-dev'
Package python-matplotlib is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or is only available from another source

Package python-scipy is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or is only available from another source

Package python-pip is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or is only available from another source
However the following packages replace it:
  python3-pip

E: Package 'python-scipy' has no installation candidate
E: Package 'python-matplotlib' has no installation candidate
E: Package 'python-pip' has no installation candidate

Could you help me to solve the problem? Thank you!

Yours, Heng

jluebeck commented 3 years ago

Hi Heng,

Apologies, those instructions do not reflect changes to python's pip system. Please follow the instructions here: https://github.com/virajbdeshpande/AmpliconArchitect/blob/master/README.md#prerequisites-2.

I will update the instructions on my fork of AA to reflect this. Please let me know if issues persist with installing the packages.

Jens

zhangheng43 commented 3 years ago

Hello Jens,

I ran these commands you mentioned. It seemed that the prerequisites had been installed, but I got an issue about pip.

When I ran sudo python2 get-pip.py, I got

ERROR: This script does not work on Python 2.7
The minimum supported Python version is 3.6. Please use https://bootstrap.pypa.io/pip/2.7/get-pip.py instead.

Then I ran wget http://bootstrap.pypa.io/pip/2.7/get-pip.py followed by sudo python2 get-pip.py.1, and got

DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Collecting pip<21.0
  WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f3623ef1390>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /packages/27/79/8a850fe3496446ff0d584327ae44e7500daf6764ca1a382d2d02789accf7/pip-20.3.4-py2.py3-none-any.whl
  WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f3623ef1690>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /packages/27/79/8a850fe3496446ff0d584327ae44e7500daf6764ca1a382d2d02789accf7/pip-20.3.4-py2.py3-none-any.whl
  WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f3623ef1a90>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /packages/27/79/8a850fe3496446ff0d584327ae44e7500daf6764ca1a382d2d02789accf7/pip-20.3.4-py2.py3-none-any.whl
  Downloading pip-20.3.4-py2.py3-none-any.whl (1.5 MB)
     |████████████████████████████████| 1.5 MB 892 kB/s
Collecting wheel
  Downloading wheel-0.36.2-py2.py3-none-any.whl (35 kB)
Installing collected packages: pip, wheel
  Attempting uninstall: pip
    Found existing installation: pip 9.0.1
    Uninstalling pip-9.0.1:
      Successfully uninstalled pip-9.0.1
Successfully installed pip-20.3.4 wheel-0.36.2

sudo pip2 install pysam Flask Cython numpy scipy matplotlib

DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Requirement already satisfied: pysam in /usr/local/lib/python2.7/dist-packages (0.16.0.1)
Requirement already satisfied: Flask in /usr/local/lib/python2.7/dist-packages (1.1.4)
Requirement already satisfied: Cython in /usr/local/lib/python2.7/dist-packages (0.29.24)
Requirement already satisfied: numpy in /usr/local/lib/python2.7/dist-packages (1.16.6)
Requirement already satisfied: scipy in /usr/local/lib/python2.7/dist-packages (1.0.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python2.7/dist-packages (2.2.5)
Requirement already satisfied: itsdangerous<2.0,>=0.24 in /usr/local/lib/python2.7/dist-packages (from Flask) (1.1.0)
Requirement already satisfied: Jinja2<3.0,>=2.10.1 in /usr/local/lib/python2.7/dist-packages (from Flask) (2.11.3)
Requirement already satisfied: Werkzeug<2.0,>=0.15 in /usr/local/lib/python2.7/dist-packages (from Flask) (1.0.1)
Requirement already satisfied: click<8.0,>=5.1 in /usr/local/lib/python2.7/dist-packages (from Flask) (7.1.2)
Requirement already satisfied: six>=1.10 in /usr/local/lib/python2.7/dist-packages (from matplotlib) (1.16.0)
Requirement already satisfied: pytz in /usr/local/lib/python2.7/dist-packages (from matplotlib) (2021.1)
Requirement already satisfied: backports.functools-lru-cache in /usr/local/lib/python2.7/dist-packages (from matplotlib) (1.6.4)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python2.7/dist-packages (from matplotlib) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python2.7/dist-packages (from matplotlib) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python2.7/dist-packages (from matplotlib) (1.1.0)
Requirement already satisfied: subprocess32 in /usr/lib/python2.7/dist-packages (from matplotlib) (3.5.4)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python2.7/dist-packages (from matplotlib) (2.4.7)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python2.7/dist-packages (from Jinja2<3.0,>=2.10.1->Flask) (1.1.1)
Requirement already satisfied: setuptools in /usr/local/lib/python2.7/dist-packages (from kiwisolver>=1.0.1->matplotlib) (44.1.1)

However, when I ran PrepareAA as before, I got the same issue

Running amplified_intervals
python2 /home/heng/ZhangHeng/AmpliconArchitect/src/amplified_intervals.py --ref GRCh38 --bed /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/cnvkit_output/PGCC1.cs.rmdup_CNV_GAIN.bed --bam /media/heng/D215server/Data/ZH_80-614826108_WGS/Out/PGCC1.cs.rmdup.bam --gain 4.5 --cnsize_min 50000 --out /media/heng/D215server/Data/ZH_80-614826108_WGS/paaOut/PGCC1_AA_CNV_SEEDS
Global ref name is GRCh38
Traceback (most recent call last):
  File "/home/heng/ZhangHeng/AmpliconArchitect/src/amplified_intervals.py", line 94, in <module>
    import bam_to_breakpoint as b2b
  File "/home/heng/ZhangHeng/AmpliconArchitect/src/bam_to_breakpoint.py", line 35, in <module>
    import matplotlib
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/__init__.py", line 133, in <module>
    from matplotlib.rcsetup import defaultParams, validate_backend, cycler
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/rcsetup.py", line 31, in <module>
    from matplotlib.fontconfig_pattern import parse_fontconfig_pattern
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/fontconfig_pattern.py", line 28, in <module>
    from backports.functools_lru_cache import lru_cache
ImportError: No module named functools_lru_cache
Completed

I think this is a problem with AA rather than with PrepareAA. I wonder which package is missing. How should I deal with it?

Thank you again for your help!

Yours, Heng

jluebeck commented 3 years ago

Hi Heng,

This appears to be an installation or versioning issue with Matplotlib. Can you try some of the solutions listed here? https://stackoverflow.com/questions/47179433/python-2-7-functools-lru-cache-does-not-import-although-installed

Thanks, Jens
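
Before trying fixes, it can help to confirm which interpreter fails and where each module is loaded from, since pip reported backports.functools-lru-cache as installed while the import still failed. The sketch below is a small diagnostic, not a fix; `check_import` is a hypothetical helper name. It uses only `importlib.import_module`, so it should run unchanged under both python2 and python3. Run it with the same interpreter that AA uses (python2 in the log above).

```python
import importlib


def check_import(module_name):
    """Try to import module_name; return (True, file location) or (False, error message)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__file__", "<built-in>")


for name in ("matplotlib", "backports.functools_lru_cache"):
    ok, detail = check_import(name)
    print("{}: {} ({})".format(name, "OK" if ok else "FAILED", detail))
```

If the second import fails even though pip lists the package, the printed locations show which site-packages directories are in play, which is the kind of mismatch the linked Stack Overflow answers discuss.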

zhangheng43 commented 3 years ago

Dear Jens,

Thank you very much, and I have run it successfully!

Yours, Heng