sjroth / ARTDeco

MIT License

Preprocessing: AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'? #24

Closed faleevz closed 3 months ago

faleevz commented 3 months ago

Hello, I'm trying to run preprocessing but keep running into this issue.

(artdeco) [mfaleeva@vmpr-res-cluster1 ARTDeco]$ ARTDeco -mode preprocess -gtf-file /home/mfaleeva/ARTDeco/newgenes.gtf -chrom-sizes-file /home/mfaleeva/ARTDeco/hg38.chrom.sizes [-home-dir /home/mfaleeva/ARTDeco/ -bam-files-dir /home/mfaleeva/ARTDeco/bam -layout PE -stranded TRUE -orientation Forward -cpu 1 -read-in-dist 1000 -readthrough-dist 500 -intergenic-min-len 100 -intergenic-max-len 15000 -meta-file /home/mfaleeva/ARTDeco/meta.txt -comparisons-file /home/mfaleeva/ARTDeco/comparisons.txt]

Running preprocess mode...
Loading ARTDeco file structure...
Meta file properly formatted... Generating reformatted meta...
Comparison file does not exist or not provided... Generating comparisons file...
ARTDeco will generate the following files:
./preprocess_files/5NP_LATE_R2_S10.sort.dedup
./preprocess_files/gene_types.txt
./preprocess_files/5dP_LATE_R2_S12.sort.dedup
./preprocess_files/genes.full.bed
./preprocess_files/5dP_LATE_R1_S4.sort.dedup
./preprocess_files/5NP_LATE_R1_S2.sort.dedup
./preprocess_files/genes_condensed.bed
./preprocess_files/5dD_LATE_R2_S11.sort.dedup
./preprocess_files/gene_to_transcript.txt
./preprocess_files/readthrough.bed
./preprocess_files/5ND_LATE_R1_S1.sort.dedup
./preprocess_files/5ND_LATE_R2_S9.sort.dedup
./preprocess_files/read_in.bed
GTF file needed... Checking...
GTF file exists...
BAM file format needed... Checking... Will infer if not user-specified.
BAM files specified as paired-end...
BAM files specified as stranded...
BAM files specified as forward-strand oriented...
Summarizing BAM file stats...
7 Experiments
Files are Paired-End, Strand-Specific, Forward-strand oriented
                                               Experiment  Total Reads  Mapped Reads
 /home/mfaleeva/ARTDeco/bam/5NP_LATE_R1_S2.sort.dedup.bam     20052400      20052400
 /home/mfaleeva/ARTDeco/bam/5ND_LATE_R2_S9.sort.dedup.bam     22254209      22254209
/home/mfaleeva/ARTDeco/bam/5dD_LATE_R2_S11.sort.dedup.bam     20323589      20323589
/home/mfaleeva/ARTDeco/bam/5NP_LATE_R2_S10.sort.dedup.bam      1973197       1973197
 /home/mfaleeva/ARTDeco/bam/5ND_LATE_R1_S1.sort.dedup.bam     15258988      15258988
/home/mfaleeva/ARTDeco/bam/5dP_LATE_R2_S12.sort.dedup.bam     30559820      30559820
 /home/mfaleeva/ARTDeco/bam/5dP_LATE_R1_S4.sort.dedup.bam     18114441      18114441
Convert GTF to BED...
Warning: If your Wiggle data is a significant portion of available system memory, use the --max-mem and --sort-tmpdir options, or use --do-not-sort to disable post-conversion sorting. See --help for more information.
Generating condensed genes bed...
/home/mfaleeva/.conda/envs/artdeco/lib/python3.10/site-packages/ARTDeco/preprocess.py:59: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
  genes = pd.read_csv(bed_file,sep='\t',header=None,
Generating read-in region BED file...
Traceback (most recent call last):
  File "/home/mfaleeva/.conda/envs/artdeco/bin/ARTDeco", line 8, in <module>
    sys.exit(main())
  File "/home/mfaleeva/.conda/envs/artdeco/lib/python3.10/site-packages/ARTDeco/main.py", line 423, in main
    read_in_df = create_stranded_read_in_df(genes,chrom_sizes,max_len=args.intergenic_max_len,
  File "/home/mfaleeva/.conda/envs/artdeco/lib/python3.10/site-packages/ARTDeco/preprocess.py", line 269, in create_stranded_read_in_df
    read_in = pos_strand.append(neg_strand)
  File "/home/mfaleeva/.conda/envs/artdeco/lib/python3.10/site-packages/pandas/core/generic.py", line 6296, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

Before failing, the run generates meta.reformatted.txt, genes.full.bed, genes_condensed.bed, gene_types.txt, gene_to_transcript.txt, and comparisons.reformatted.txt. There is also a summary file (bam_summary.txt).

Any insight would be greatly appreciated! Thank you

sjroth commented 3 months ago

I have never seen this error in the context of ARTDeco. What version of Python and Pandas are you using?

faleevz commented 3 months ago

Pandas is 2.21 and Python is 3.10.

sjroth commented 3 months ago

ARTDeco was written for Pandas 0.24 and Python 3.6, as specified in the conda YAML file. I will likely update ARTDeco in the coming months to support newer versions when I get a chance.
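
For anyone comparing their own environment against those versions, here is a minimal sketch using only the Python standard library. The helper names `installed_version` and `has_dataframe_append` are hypothetical and not part of ARTDeco. The check relies on the fact that `DataFrame.append` exists only in pandas releases before 2.0.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_dataframe_append(pandas_version):
    """Return True if this pandas version still ships DataFrame.append (major < 2)."""
    major = int(pandas_version.split(".")[0])
    return major < 2


print("Python:", sys.version.split()[0])
pandas_version = installed_version("pandas")
print("pandas:", pandas_version)
if pandas_version is not None:
    print("DataFrame.append available:", has_dataframe_append(pandas_version))
```

If the last line reports `False`, code that calls `DataFrame.append` will raise the `AttributeError` shown in the traceback above.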

faleevz commented 3 months ago

Fixed: pandas removed DataFrame.append in version 2.0.0 (it had been deprecated since 1.4.0), so I downgraded to pandas 1.5.3. Thanks!
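
As an alternative to downgrading, the failing call could be rewritten with `pd.concat`, which works on both old and new pandas. The sketch below is not the ARTDeco source: the two small frames are made-up stand-ins for the `pos_strand` and `neg_strand` variables named in the traceback, and the column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for the frames built in create_stranded_read_in_df.
pos_strand = pd.DataFrame(
    {"chrom": ["chr1"], "start": [100], "end": [200], "strand": ["+"]}
)
neg_strand = pd.DataFrame(
    {"chrom": ["chr1"], "start": [300], "end": [400], "strand": ["-"]}
)

# Removed in pandas 2.0:
#     read_in = pos_strand.append(neg_strand)
# Equivalent call that keeps the original row labels, as append did by default:
read_in = pd.concat([pos_strand, neg_strand])

# Optional variant: renumber the rows 0..n-1 if duplicate labels are unwanted.
read_in_reindexed = pd.concat([pos_strand, neg_strand], ignore_index=True)

print(read_in)
```

Patching an installed package by hand is fragile, so pinning pandas to a pre-2.0 release, as done above, remains the simpler route until ARTDeco itself is updated.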