akikuno / DAJIN2

🔬 Genotyping tool for genome-edited samples, utilizing nanopore sequencer target sequencing
MIT License

two more errors in execution (one fixed) #24

Closed takeiga closed 8 months ago

takeiga commented 8 months ago

Other errors occurred when I tried another PC.

After a successful install on the other PC, I ran the same command: DAJIN2 --control barcode01/ --sample barcode02/ --allele actc1L_cont_knockin.fa --name act1c --genome xenLae2 --threads 8

But another error occurred:

2024-03-26 18:29:11, INFO, barcode01/ is now processing...
2024-03-26 18:29:14, INFO, Preprocess barcode01/...
2024-03-26 18:30:27, INFO, Output BAM files of barcode01/...
2024-03-26 18:30:28, INFO, 🍵 barcode01/ is finished!
2024-03-26 18:30:28, INFO, barcode02/ is now processing...
2024-03-26 18:30:31, INFO, Preprocess barcode02/...
2024-03-26 18:30:38, ERROR, Catch an Exception. Traceback:
Traceback (most recent call last):
  File "/home/igawa/miniconda3/envs/dajin2/bin/DAJIN2", line 10, in <module>
    sys.exit(execute())
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 236, in execute
    execute_single_mode(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 48, in execute_single_mode
    core.execute_sample(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/core.py", line 120, in execute_sample
    preprocess.cache_mutation_loci(ARGS, is_control=False)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/preprocess/mutation_extractor.py", line 322, in cache_mutation_loci
    mutation_loci = extract_mutation_loci(
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/preprocess/mutation_extractor.py", line 280, in extract_mutation_loci
    anomal_loci = extract_anomal_loci(
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/preprocess/mutation_extractor.py", line 128, in extract_anomal_loci
    idx_outliers = detect_anomalies(values_sample, values_control, thresholds[mut], is_consensus)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/preprocess/mutation_extractor.py", line 111, in detect_anomalies
    kmeans = MiniBatchKMeans(n_clusters=2, random_state=0, n_init="auto").fit(values_subtract_reshaped)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py", line 1960, in fit
    self._check_params(X)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py", line 1792, in _check_params
    super()._check_params(X)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py", line 818, in _check_params
    if self.n_init <= 0:
TypeError: '<=' not supported between instances of 'str' and 'int'

I was able to fix this by upgrading scikit-learn with pip: pip install -U scikit-learn
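The traceback above fails because the installed scikit-learn compares n_init with an integer, while the string value "auto" is only accepted from scikit-learn 1.2 onward. As a minimal sketch (not part of DAJIN2), a caller could pick a compatible value from the installed version. The helper name choose_n_init and the integer fallback of 3 are assumptions made for illustration.

```python
import re


def choose_n_init(sklearn_version: str):
    """Return an n_init value that the given scikit-learn version accepts.

    Hypothetical helper for illustration; it is not part of DAJIN2 or scikit-learn.
    """
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {sklearn_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # n_init="auto" is accepted from scikit-learn 1.2 onward; older releases
    # expect an integer and raise TypeError on `self.n_init <= 0`.
    # The fallback of 3 is an assumed integer value, chosen for this sketch.
    return "auto" if (major, minor) >= (1, 2) else 3


print(choose_n_init("1.1.3"))
print(choose_n_init("1.4.1.post1"))
```

With scikit-learn installed, the result could be passed along as MiniBatchKMeans(n_clusters=2, random_state=0, n_init=choose_n_init(sklearn.__version__)). Upgrading scikit-learn, as done here, is the simpler route.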

However, another error occurred after upgrading scikit-learn:

2024-03-26 18:46:37, INFO, barcode01/ is now processing...
2024-03-26 18:46:39, INFO, Preprocess barcode01/...
2024-03-26 18:47:52, INFO, Output BAM files of barcode01/...
2024-03-26 18:47:53, INFO, 🍵 barcode01/ is finished!
2024-03-26 18:47:53, INFO, barcode02/ is now processing...
2024-03-26 18:47:56, INFO, Preprocess barcode02/...
2024-03-26 18:48:06, INFO, Classify barcode02/...
2024-03-26 18:48:07, INFO, Clustering barcode02/...
2024-03-26 18:48:28, INFO, Consensus calling of barcode02/...
2024-03-26 18:50:15, INFO, Output reports of barcode02/...
2024-03-26 18:50:15, ERROR, Catch an Exception. Traceback:
Traceback (most recent call last):
  File "/home/igawa/miniconda3/envs/dajin2/bin/DAJIN2", line 10, in <module>
    sys.exit(execute())
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 236, in execute
    execute_single_mode(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 48, in execute_single_mode
    core.execute_sample(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/core.py", line 214, in execute_sample
    report.report_bam.export_to_bam(
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/report/report_bam.py", line 127, in export_to_bam
    write_sam_to_bam(sam_headers + sam_content, path_sam_output, path_bam_output, THREADS)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/report/report_bam.py", line 86, in write_sam_to_bam
    Path(path_sam).write_text(formatted_sam + "\n")
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/pathlib.py", line 1154, in write_text
    with self.open(mode='w', encoding=encoding, errors=errors, newline=newline) as f:
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
FileNotFoundError: [Errno 2] No such file or directory: 'DAJIN_Results/.tempdir/act1c/report/bam/tmp240891_allele1_control_indels_40.876%.sam'

Could this be an error in the DAJIN2 code, or specifically in pathlib.py? I would be happy if a DAJIN2 update fixed it.

Thanks,

akikuno commented 8 months ago

Thank you for the information, and I apologize for any inconvenience this may have caused you. Currently, the reason for the bug is unclear to me. If possible, could you please run your analysis again with the --debug option added? This option retains the temporary directory, allowing you to access all the intermediate files.

DAJIN2 --control barcode01 \
    --sample barcode02 \
    --allele actc1L_cont_knockin.fa  \
    --name act1c_debug \
    --genome xenLae2 \
    --threads 8 \
    --debug

After executing DAJIN2, could you check the presence of the sam/bam file and provide me with the results of the following two commands?

ls -l DAJIN_Results/.tempdir/act1c_debug/barcode02/sam/
ls -l DAJIN_Results/.tempdir/act1c_debug/report/bam/

They will help in diagnosing the issue.

takeiga commented 8 months ago

Thanks again for your kind support! I ran the analysis with the --debug option, but DAJIN_Results/.tempdir/act1c_debug/barcode02/sam/ was not created; DAJIN_Results/.tempdir/02/barcode02/sam was created instead (I used --name 02).

Then, "ls -l DAJIN_Results/.tempdir/02/barcode02/sam" showed:

ls -l DAJIN_Results/.tempdir/02/barcode02/sam
total 26400
-rw-rw-r-- 1 igawa igawa 6483084 Mar 28 15:08 map-ont_Knock-in.sam
-rw-rw-r-- 1 igawa igawa 6845200 Mar 28 15:08 map-ont_control.sam
-rw-rw-r-- 1 igawa igawa 6896784 Mar 28 15:08 splice_Knock-in.sam
-rw-rw-r-- 1 igawa igawa 6799747 Mar 28 15:08 splice_control.sam

In the same manner,

ls -l DAJIN_Results/.tempdir/02/report/BAM
total 5156
drwxrwxr-x 2 igawa igawa    4096 Mar 28 15:08 barcode01
drwxrwxr-x 2 igawa igawa    4096 Mar 28 15:15 barcode02
-rw-rw-r-- 1 igawa igawa 5270935 Mar 28 15:15 tmp240891_barcode02_control.sam
akikuno commented 8 months ago

Thank you! It seems that DAJIN2 generated a proper temporary SAM file... Could you please tell me which error message appeared in this run? Sorry for the inconvenience 🙇

takeiga commented 8 months ago

Yes, all files seemed to be generated correctly. The stdout of this run was as follows (the same error as in my first report in this thread):

DAJIN2 --control barcode01 --sample barcode02 --allele actc1L_cont_knockin.fa --name 02 --genome xenLae2 --threads 8 --debug
2024-03-28 15:05:21, INFO, barcode01 is now processing...
2024-03-28 15:05:24, INFO, Preprocess barcode01...
2024-03-28 15:08:28, INFO, Output BAM files of barcode01...
2024-03-28 15:08:29, INFO, 🍵 barcode01 is finished!
2024-03-28 15:08:29, INFO, barcode02 is now processing...
2024-03-28 15:08:32, INFO, Preprocess barcode02...
2024-03-28 15:08:57, INFO, Classify barcode02...
2024-03-28 15:08:58, INFO, Clustering barcode02...
2024-03-28 15:09:56, INFO, Consensus calling of barcode02...
2024-03-28 15:15:07, INFO, Output reports of barcode02...
2024-03-28 15:15:07, ERROR, Catch an Exception. Traceback:
Traceback (most recent call last):
  File "/home/igawa/miniconda3/envs/dajin2/bin/DAJIN2", line 10, in <module>
    sys.exit(execute())
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 236, in execute
    execute_single_mode(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/main.py", line 48, in execute_single_mode
    core.execute_sample(arguments)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/core.py", line 214, in execute_sample
    report.report_bam.export_to_bam(
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/report/report_bam.py", line 127, in export_to_bam
    write_sam_to_bam(sam_headers + sam_content, path_sam_output, path_bam_output, THREADS)
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/site-packages/DAJIN2/core/report/report_bam.py", line 86, in write_sam_to_bam
    Path(path_sam).write_text(formatted_sam + "\n")
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/pathlib.py", line 1154, in write_text
    with self.open(mode='w', encoding=encoding, errors=errors, newline=newline) as f:
  File "/home/igawa/miniconda3/envs/dajin2/lib/python3.10/pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
FileNotFoundError: [Errno 2] No such file or directory: 'DAJIN_Results/.tempdir/02/report/bam/tmp240891_allele1_control_indels_40.876%.sam'
akikuno commented 8 months ago

@takeiga Thank you so much for your reports!

Regarding the error with Path(path_sam).write_text(formatted_sam + "\n"), there is no problem with the directory DAJIN_Results/.tempdir/02/report/bam/ because it was properly created, so the issue might lie in the filename tmp240891_allele1_control_indels_40.876%.sam. Considering that tmp240891_barcode02_control.sam was created successfully, I suspect the % sign might be causing the issue. (I have never encountered this % sign issue in my Ubuntu and macOS environments, so honestly I don't think this is very likely...)

If it's not too much trouble, could you please verify if the following commands produce an error? If the % is the problem, the first command should result in an error, and the second should successfully display "test2".

echo "test1" > DAJIN_Results/.tempdir/02/report/bam/tmp240891_allele1_control_indels_40.876%.sam

echo "test2" > DAJIN_Results/.tempdir/02/report/bam/tmp240891_allele1_control_indels_40.876pct.sam

cat DAJIN_Results/.tempdir/02/report/bam/tmp240891_allele1_control_indels_40.876%.sam

cat DAJIN_Results/.tempdir/02/report/bam/tmp240891_allele1_control_indels_40.876pct.sam
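The same check can be sketched in Python, which is closer to how DAJIN2 writes the file (Path.write_text). This is only an illustration: it runs in a throwaway temporary directory, not in DAJIN_Results, and the helper name can_write is hypothetical.

```python
import tempfile
from pathlib import Path


def can_write(directory: Path, filename: str, text: str) -> bool:
    """Write a file and read it back; return True if the text round-trips."""
    target = directory / filename
    try:
        target.write_text(text + "\n")
        return target.read_text() == text + "\n"
    except OSError:
        # FileNotFoundError and similar filesystem errors end up here.
        return False


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    # Same two filenames as in the shell commands above.
    percent_ok = can_write(tmp_dir, "tmp240891_allele1_control_indels_40.876%.sam", "test1")
    pct_ok = can_write(tmp_dir, "tmp240891_allele1_control_indels_40.876pct.sam", "test2")
    print("with % sign:", percent_ok)
    print("with pct:", pct_ok)
```

If the first result were False and the second True on the affected machine, the % sign would be the likely culprit; if both are True, the filename itself is probably not the problem.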
takeiga commented 8 months ago

@akikuno Those commands produced "No such file or directory", because in my run there is no "bam" folder, only a "BAM" (capitalized) folder.

So I tried

echo "test1" > DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876%.sam
echo "test2" > DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876pct.sam
cat DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876%.sam
cat DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876pct.sam

This produced the following without any error.

cat DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876%.sam
test1
cat DAJIN_Results/.tempdir/02/report/BAM/tmp240891_allele1_control_indels_40.876pct.sam
test2

Could specifying the folder as "bam" instead of "BAM" be causing the problem?
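A mismatch like this can be spotted programmatically. The sketch below is a hypothetical diagnostic, not DAJIN2 code: given the path a program expects, it looks for a sibling entry whose name differs only by letter case. The directory layout is recreated in a temporary directory for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_case_variant(expected: Path) -> Optional[Path]:
    """Return a sibling of `expected` whose name differs only by letter case."""
    parent = expected.parent
    if not parent.is_dir():
        return None
    for entry in parent.iterdir():
        if entry.name != expected.name and entry.name.lower() == expected.name.lower():
            return entry
    return None


with tempfile.TemporaryDirectory() as tmp:
    report_dir = Path(tmp) / "report"
    # Recreate the situation from this run: only "BAM" exists on disk.
    (report_dir / "BAM").mkdir(parents=True)
    variant = find_case_variant(report_dir / "bam")
    print(variant.name if variant else "no case variant found")
```

On a case-sensitive filesystem, writing to report/bam/... while only report/BAM exists raises FileNotFoundError, which matches the traceback reported above.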

akikuno commented 8 months ago

Could specifying the folder as "bam" instead of "BAM" be causing the problem?

Thank you for your confirmation! I'm thrilled with the feedback. I'll promptly revise the code and ensure DAJIN2 is upgraded, potentially by the end of this week.
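The thread does not say what the eventual code change was. As a hedged illustration only, one common way to make a write robust against a missing output directory is to create the parent directories immediately before writing. The function name write_text_with_parents and the example file name are hypothetical.

```python
import tempfile
from pathlib import Path


def write_text_with_parents(path: Path, text: str) -> None:
    """Create any missing parent directories, then write the text file."""
    # exist_ok=True makes this a no-op when the directory is already present.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)


with tempfile.TemporaryDirectory() as tmp:
    # "report/bam" does not exist yet; a plain write_text would raise FileNotFoundError.
    sam_path = Path(tmp) / "report" / "bam" / "tmp_example.sam"
    write_text_with_parents(sam_path, "@HD\tVN:1.6\n")
    written = sam_path.read_text()
    print(repr(written))
```

Defining the directory name once and reusing the same variable wherever the path is built would also avoid "bam" and "BAM" drifting apart.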

takeiga commented 8 months ago

I am very surprised and grateful for your quick response. I'm looking forward to the upgraded version, but please do this only when you have time.

takeiga commented 8 months ago

I'm very grateful for your quick update of DAJIN2!! After updating DAJIN2 to 0.4.3, I hit another error caused by a different part of the update, genome_fetcher.py, but the run completed successfully (yay!) once I replaced genome_fetcher.py. I'll report this error in another thread.