RitchieLabIGH / IRFinder


IRFinder diff - DESeq2 problem with too many elements in a line #30

Closed: nkaplin1 closed this issue 1 year ago

nkaplin1 commented 1 year ago

Hi -

I'm trying to run IRFinder diff using DESeq2 and am getting the following error in the log.err file:

Attaching package: 'Biobase'

The following object is masked from 'package:MatrixGenerics':

rowMedians

The following objects are masked from 'package:matrixStats':

anyMissing, rowMedians

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :

line 65 did not have 21 elements

Calls: DESeqDataSetFromIRFinder -> read.table -> scan

Execution halted

Any clue what I may have set up incorrectly?

nkaplin1 commented 1 year ago

Solved my own problem. Some of my Arabidopsis gene names contained spaces, which were being interpreted as column separators when the data table was read. I fixed the issue with this Perl script, which replaces the spaces with underscores:

#!/usr/bin/perl

use strict;
use warnings;
use File::Copy;

# Script to replace spaces in gene names in IRFinder output files, which cause
# DESeq2 to fail - this is a problem with the Ensembl Arabidopsis annotation file.

# Run in the parent directory that contains all of the IRFinder output
# directories you want to fix - you can replace . with a target directory.
my $parent_dir = '.';

# IRFinder-IR-nondir.txt files will be fixed - replace file name to fix other files.
my $file_name = 'IRFinder-IR-nondir.txt';

# Number of files cleaned so far
my $count = 0;

# Open the parent directory
opendir(my $dh, $parent_dir) or die "Cannot open directory: $!";

# Iterate through all the entries in the directory
while (my $sub_dir = readdir($dh)) {

    # Skip the . and .. entries
    next if ($sub_dir eq '.' or $sub_dir eq '..');

    my $dir_path = "$parent_dir/$sub_dir";
    # Make sure the entry is a directory
    next unless (-d $dir_path);

    my $file_path = "$dir_path/$file_name";
    # Make sure the target file exists
    if (-e $file_path) {
        # Back up the original file
        copy($file_path, "$file_path.original") or die "Copy failed: $!";
        print "Copied $file_path\n";

        # Open the backup for reading and the original path for writing
        open my $in,  '<', "$file_path.original" or die "Can't read old file: $!";
        open my $out, '>', $file_path or die "Can't write new file: $!";

        # Replace every space with an underscore
        while (my $line = <$in>) {
            $line =~ s/ /_/g;
            print $out $line;
        }

        close($in);
        close($out);

        print "Cleaned $file_path\n";
        $count++;
    }
}

closedir($dh);

print "Cleaned $count $file_name files\n";
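
For anyone who wants to confirm that embedded spaces are the cause before rewriting any files, here is a minimal Python sketch that lists the tab-separated fields containing a space. It assumes the IRFinder output is tab-delimited. The "line N did not have M elements" error is consistent with R's read.table splitting on any whitespace, which is its default when `sep` is not set. The function name `find_fields_with_spaces` and the sample rows (including the gene name with a space) are made up for illustration and are not taken from real IRFinder output.

```python
def find_fields_with_spaces(lines):
    """Return (line_number, offending_fields) for every line in which at
    least one tab-separated field contains a space character."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Assumption: columns are separated by tabs
        fields = line.rstrip("\n").split("\t")
        bad = [field for field in fields if " " in field]
        if bad:
            hits.append((number, bad))
    return hits


# Hypothetical sample rows; the third one has a space inside the name column
sample = [
    "Chr\tStart\tEnd\tName\n",
    "1\t100\t200\tAT1G01010/clean\n",
    "1\t300\t400\tAT1G01020 like/clean\n",
]

for number, bad in find_fields_with_spaces(sample):
    print(f"line {number}: fields with spaces: {bad}")
```

To check a real file, pass an open file handle instead of the sample list, for example `find_fields_with_spaces(open("IRFinder-IR-nondir.txt"))`. Running the check again after the Perl script should report nothing. Note that the Perl script overwrites the `.original` backup if it is run a second time, so it is worth checking first.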