Open alexmascension opened 8 months ago
Hi Alex,
Yes, please use a full path (not relative).
What are the contents of your data repo directory, /data/Proyectos/NGS_pipeline/database/indexes/GRCm38/aa_data_repo?
If the data repo files (presumably for GRCm38) haven't been downloaded into that directory, this may cause a bug. You can download the relevant ones here if you haven't obtained them already. It is also possible that there is a bug with using GRCm38 and the Nextflow version of AmpliconSuite.
Thanks, Jens
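For anyone wanting to run this check quickly, here is a minimal Python sketch that resolves the data repo path to an absolute one and reports which files are missing for a given reference build. The helper name `missing_repo_files` and the list of expected file names are assumptions based only on the directory listing and log lines in this thread. They are not an official AmpliconSuite manifest, so adjust them to your setup.

```python
from pathlib import Path


def missing_repo_files(data_repo, ref, expected=None):
    """Return absolute paths of expected files absent from <data_repo>/<ref>/.

    The default file names are an assumption taken from the listing and logs
    in this thread, not an official list of required files.
    """
    # Resolve to an absolute path, since a relative path was one suspected cause.
    ref_dir = Path(data_repo).expanduser().resolve() / ref
    if expected is None:
        expected = [
            "file_sources.txt",
            "file_list.txt",
            "last_updated.txt",
            f"{ref}_cnvkit_filtered_ref.cnn",
        ]
    return [str(ref_dir / name) for name in expected if not (ref_dir / name).is_file()]
```

Calling `missing_repo_files("/path/to/aa_data_repo", "mm10")` returns an empty list when every expected file is present under the `mm10` subdirectory. A non-empty list shows exactly which absolute paths were checked, which makes a mismatch between directory names easy to spot.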
Hi! My list of files is this:
database/indexes/GRCm38/aa_data_repo/mm10:
annotations file_list.txt mm10-blacklist.v2.bed mm10_conserved_gain5.bed mm10.fa.amb mm10.fa.fai mm10.Hardison.Excludable.full.bed mm10_noAlt.fa.fai
cancer file_sources.txt mm10_centromere.bed mm10_conserved_gain5_onco_subtract.bed mm10.fa.ann mm10.fa.pac mm10_k35.mappability.bedgraph onco_bed.bed
dummy_ploidy.vcf last_updated.txt mm10_cnvkit_filtered_ref.cnn mm10.fa mm10.fa.bwt mm10.fa.sa mm10_merged_centromeres_conserved_sorted.bed
database/indexes/GRCm38/aa_data_repo/mm10/annotations:
gencode.vM10.basic.annotation_genes.gff mm10GenomicSuperDup.tab
database/indexes/GRCm38/aa_data_repo/mm10/cancer:
oncogene_list.txt oncogenes
database/indexes/GRCm38/aa_data_repo/mm10/cancer/oncogenes:
AC_oncogene_set_mm10.gff mm10_consensus_oncogenes_list_from_hg19.gff
I downloaded it from the repository you mentioned.
The strangest thing is that the file /data/Proyectos/NGS_pipeline/database/indexes/GRCh38/aa_data_repo/mm10/file_sources.txt does exist.
Thanks Alex,
This is very strange - can you try the 1.0.5 beta version of circdna?
If you need a hold-over solution until this is resolved, there are Docker & Singularity images for AmpliconSuite available here.
If you are able to download this small hg19-aligned BAM file from SRA, it would also be a good test of your installation, provided you download the appropriate data repo for this sample too (e.g. create /data/Proyectos/NGS_pipeline/database/indexes/GRCh38/aa_data_repo/hg19/).
Hi! Now it seems to work, but I get the following error:
ERROR ~ Error executing process > 'NFCORE_CIRCDNA:CIRCDNA:AMPLICONSUITE (CDNA_3)'
Caused by:
Process `NFCORE_CIRCDNA:CIRCDNA:AMPLICONSUITE (CDNA_3)` terminated with an error exit status (1)
Command executed:
export AA_DATA_REPO=$(echo aa_data_repo)
export MOSEKLM_LICENSE_FILE=$(echo others)
# Define Variables AA_SRC and AC_SRC
export AA_SRC=$(dirname $(python -c "import ampliconarchitectlib; print(ampliconarchitectlib.__file__)"))
export AC_SRC=$(dirname $(which amplicon_classifier.py))
REF=GRCh38
AmpliconSuite-pipeline.py \
\
-s CDNA_3 \
-t 2 \
--bam CDNA_3.md.bam \
--ref GRCh38 \
--run_AA --run_AC \
# Move Files to base work directory
find CDNA_3_cnvkit_output/ -type f -print0 | xargs -0 mv -t ./
find CDNA_3_AA_results/ -type f -print0 | xargs -0 mv -t ./
find CDNA_3_classification/ -type f -print0 | xargs -0 mv -t ./
cat <<-END_VERSIONS > versions.yml
"NFCORE_CIRCDNA:CIRCDNA:AMPLICONSUITE":
AmpliconSuite-pipeline.py: $(AmpliconSuite-pipeline.py --version | sed 's/AmpliconSuite-pipeline version //')
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
Running AmpliconSuite-pipeline on sample: CDNA_3
CDNA_3.md.bam index not found, calling samtools index
Finished indexing
CDNA_3.md.bam: 3371866 + 0 properly paired (86.08% : N/A)
WARNING: BAM FILE PROPERLY PAIRED RATE IS BELOW 95%.
Quality of data may be insufficient for AA analysis. Poorly controlled insert size distribution during sample prep can cause high fractions of read pairs to be marked as discordant during alignment. Artifactual short SVs and long runtimes may occur!
Running CNVKit batch
python3 /opt/conda/bin/cnvkit.py batch -m wgs -r aa_data_repo/GRCh38/GRCh38_cnvkit_filtered_ref.cnn -p 2 -d CDNA_3_cnvkit_output/ CDNA_3.md.bam
Matplotlib created a temporary cache directory at /tmp/matplotlib-4hlj9weh because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Fontconfig error: No writable cache directories
CNVkit 0.9.10
Wrote CDNA_3_cnvkit_output/GRCh38_cnvkit_filtered_ref.target-tmp.bed with 568558 regions
Wrote CDNA_3_cnvkit_output/GRCh38_cnvkit_filtered_ref.antitarget-tmp.bed with 0 regions
Running 1 samples in 2 processes (that's 2 processes per bam)
Running the CNVkit pipeline on CDNA_3.md.bam ...
Processing reads in CDNA_3.md.bam
Running CNVKit segment
python3 /opt/conda/bin/cnvkit.py segment CDNA_3_cnvkit_output/CDNA_3.md.cnr -p 2 -m cbs -o CDNA_3_cnvkit_output/CDNA_3.md.cns
Matplotlib created a temporary cache directory at /tmp/matplotlib-4xkvob46 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Fontconfig error: No writable cache directories
Traceback (most recent call last):
File "/opt/conda/bin/cnvkit.py", line 10, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.10/site-packages/cnvlib/cnvkit.py", line 10, in main
args.func(args)
File "/opt/conda/lib/python3.10/site-packages/cnvlib/commands.py", line 986, in _cmd_segment
cnarr = read_cna(args.filename)
File "/opt/conda/lib/python3.10/site-packages/cnvlib/cmdutil.py", line 12, in read_cna
return tabio.read(infile, into=CNA, sample_id=sample_id, meta=meta)
File "/opt/conda/lib/python3.10/site-packages/skgenome/tabio/__init__.py", line 75, in read
dframe = reader(infile, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/skgenome/tabio/tab.py", line 17, in read_tab
dframe = pd.read_csv(infile, sep="\t", dtype={"chromosome": "str"})
File "/opt/conda/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 948, in read_csv
return _read(filepath_or_buffer, kwds)
File "/opt/conda/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 611, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/opt/conda/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1448, in __init__
self._engine = self._make_engine(f, self.engine)
File "/opt/conda/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1705, in _make_engine
self.handles = get_handle(
File "/opt/conda/lib/python3.10/site-packages/pandas/io/common.py", line 863, in get_handle
handle = open(
FileNotFoundError: [Errno 2] No such file or directory: 'CDNA_3_cnvkit_output/CDNA_3.md.cnr'
CNVKit encountered a non-zero exit status. Exiting...
Work dir:
/data/Proyectos/NGS_pipeline/work/4c/72c545492771caf5e1c531da46300c
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
I'm not sure if I need to add any parameter to the config of nf-core. Thanks!
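The traceback above shows that the segment step could not find CDNA_3_cnvkit_output/CDNA_3.md.cnr, so it may help to see what the batch step actually wrote. Below is a minimal Python sketch, meant to be run from the process work dir, that groups the files in the CNVkit output directory by suffix. The helper name `cnvkit_outputs` is hypothetical, and the suffixes are the ones visible in this log (.bed, .cnn, .cnr, .cns). It only reports what is there and does not explain why a file is missing.

```python
from pathlib import Path


def cnvkit_outputs(outdir):
    """Group the files found in a CNVkit output directory by suffix."""
    found = {".bed": [], ".cnn": [], ".cnr": [], ".cns": []}
    out = Path(outdir)
    if not out.is_dir():
        # Missing directory: return the empty groups rather than raising.
        return found
    for path in sorted(out.iterdir()):
        if path.is_file() and path.suffix in found:
            found[path.suffix].append(path.name)
    return found
```

For example, `cnvkit_outputs("CDNA_3_cnvkit_output")[".cnr"]` being empty would confirm that the batch step ended without writing the ratio file that the segment step expects.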
Unfortunately, I am not able to reproduce this issue locally when I test the latest version of the tool on a GRCh38-aligned sample.
If you run docker image ls, what version of the prepareaa Docker image do you have?
Did it work for your mm10 sample?
Description of the bug
Hi!
I'm running the pipeline with the following pipeline:
In the process
NFCORE_CIRCDNA:CIRCDNA:AMPLICONCLASSIFIER_AMPLICONSIMILARITY
I get the following error:
At first I thought it could be a problem due to it being a relative path or so, but using
Yields the same error but with the full path.
Command used and terminal output
No response
Relevant files
No response
System information
No response