shenlab-sinai / ngsplot

Quick mining and visualization of NGS data by integrating genomic databases

Error with own regions #85

Open mpaya opened 5 years ago

mpaya commented 5 years ago

I'm using a genome that is in neither Ensembl nor UCSC, so I made up a -G value and converted my GFF genes into BED format (chrom_name start end gene_name score strand). The documentation does not include many examples for this case, so I'll describe my attempts here. I have two tissues that I want to plot together. One has duplicates, the other doesn't. The bam files were processed with Picard MarkDuplicates without removing the marked reads.
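For reference, the GFF-to-BED conversion described above can be sketched roughly as follows. This is only an illustration under assumptions, not the exact script I used: it assumes GFF3-style `key=value` attributes, takes the gene name from the `ID` attribute, writes a placeholder score of 0, and follows the usual convention that GFF coordinates are 1-based inclusive while BED coordinates are 0-based half-open. The function name and the gene ID in the usage example are hypothetical.

```python
def gff_genes_to_bed(gff_lines, feature="gene", name_key="ID"):
    """Yield BED6 lines (chrom, start, end, name, score, strand) for one feature type."""
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF record has exactly 9 tab-separated columns; keep only the wanted feature.
        if len(cols) != 9 or cols[2] != feature:
            continue
        chrom, strand, attrs = cols[0], cols[6], cols[8]
        start, end = int(cols[3]), int(cols[4])
        # Parse GFF3 attributes of the form "key=value;key=value".
        attr = dict(kv.split("=", 1) for kv in attrs.split(";") if "=" in kv)
        name = attr.get(name_key, ".")
        # GFF is 1-based inclusive, BED is 0-based half-open: only the start shifts.
        yield "\t".join([chrom, str(start - 1), str(end), name, "0", strand])


if __name__ == "__main__":
    # Hypothetical two-record example; only the "gene" record is converted.
    example = [
        "##gff-version 3\n",
        "A01\tsrc\tgene\t101\t200\t.\t+\t.\tID=Bra000001;Name=x\n",
        "A01\tsrc\tmRNA\t101\t200\t.\t+\t.\tID=Bra000001.1\n",
    ]
    for bed_line in gff_genes_to_bed(example):
        print(bed_line)
```

If the GFF uses a different attribute for gene names, `name_key` would need to change accordingly.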

  1. First, create a configuration file that includes the paths to both tissues, specifying that I want all regions plotted for each of them.
    ../../ChIP_leaf/bam_files/2_filt-markdup_C1.bam:../../ChIP_leaf/bam_files/2_filt-markdup_C3.bam -1  "Leaves"
    ../../ChIP_FL/bams/2_filt-markdup_FL_C1.bam  -1  "Flowers"

Then, run ngs.plot.r with my made-up genome name, specifying my bed file with -E, the above config file and other parameters. It does not work.

    ngs.plot.r -G Bra3.0 -R bed -E gene_lists/Bra_3.0_genes.bed -C config_both.txt -O both -P 6 -FL 300 -IN 1

    Configuring variables...Error in file(file, "rt") : cannot open the connection
    Calls: SetupPlotCoord -> ReadBedCoord -> read.table -> file
    In addition: Warning messages:
    1: In ReadBedCoord(ur) : File name: '-1' does not seem to a correct name 
    for bed file.

    2: In file(file, "rt") : cannot open file '-1': No such file or directory
    Execution halted
  2. Modify the config file to give my bed file in the second column instead of '-1'.

    ngs.plot.r -G Bra3.0 -R bed  -C config_both.txt -O both -P 6 -FL 300 -IN 1
    Configuring variables...Done
    Loading R libraries.....Done
    Analyze bam files and calculate coverageError in bamFileList(ctg.tbl) : 
    No mix of bam and bam-pair allowed in configuration.
    Execution halted
  3. Modify the config and duplicate the single file so that both entries are pairs. Everything runs normally, but I get negative fold changes for the duplicates and a flat line for the doubled file. Heatmaps are blank.

  4. Merge the bam files of the duplicates (MergeSamFiles tool from Galaxy) and run ChIP:input pairs. Here, leaves (merged files) have significantly higher values than flowers (single sample). Still no heatmap.

    ../../ChIP_leaf/bam_files/2_filt-markdup_ChIP.bam:../../ChIP_leaf/bam_files/2_filt-markdup_INPUT.bam    gene_lists/Bra_3.0_genes.bed    "Leaves"
    ../../ChIP_FL/bams/2_filt-markdup_FL_C1.bam:../../ChIP_FL/bams/2_filt-markdup_FL_I1.bam gene_lists/Bra_3.0_genes.bed    "Flowers"
    ngs.plot.r -G Bra3.0 -R bed -C config_both.txt -O both -P 6 -FL 300 -IN 1
    Configuring variables...Done
    Loading R libraries.....Done
    Analyze bam files and calculate coverageWarning messages:
    1: In headerIndexBam(bam.list) :
    Aligner for: ../../ChIP_leaf/bam_files/2_filt-markdup_ChIP.bam cannot be determined. Style of 
    standard SAM mapping score will be used. Would you mind submitting an issue 
    report to us on Github? This will benefit people using the same aligner.
    2: In headerIndexBam(bam.list) :
    Aligner for: ../../ChIP_leaf/bam_files/2_filt-markdup_INPUT.bam cannot be determined. Style of 
    standard SAM mapping score will be used. Would you mind submitting an issue 
    report to us on Github? This will benefit people using the same aligner.
    Warning message:
    'isNotPrimaryRead' is deprecated.
    Use 'isSecondaryAlignment' instead.
    See help("Deprecated") 
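The first two failures above could have been caught before running ngs.plot.r with a small pre-check of the configuration file. The sketch below is only an illustration based on the error messages quoted in this report, not part of ngs.plot: it assumes the config is tab-delimited with the coverage file(s) in column 1 and the gene list or bed file in column 2, and that a `:` in column 1 marks a bam-pair. The function name is hypothetical.

```python
def check_ngsplot_config(lines, region="bed"):
    """Return a list of problems found in ngs.plot-style configuration lines."""
    problems = []
    kinds = set()
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            problems.append(f"line {n}: expected at least 3 tab-separated columns")
            continue
        cov, gene_list = fields[0].strip(), fields[1].strip()
        # Assumption: "a.bam:b.bam" is a bam-pair, a single path is a plain bam.
        kinds.add("bam-pair" if ":" in cov else "bam")
        # With -R bed, '-1' in column 2 was read as a bed file name and failed.
        if region == "bed" and gene_list == "-1":
            problems.append(f"line {n}: column 2 is '-1' but -R bed needs a bed file path")
    # Mirrors the "No mix of bam and bam-pair allowed" error.
    if len(kinds) > 1:
        problems.append("mix of bam and bam-pair entries")
    return problems


if __name__ == "__main__":
    # The shape of my first config: one pair with '-1', one single bam with '-1'.
    first_try = [
        'leaf_C1.bam:leaf_C3.bam\t-1\t"Leaves"\n',
        'FL_C1.bam\t-1\t"Flowers"\n',
    ]
    for problem in check_ngsplot_config(first_try):
        print(problem)
```

The file names in the usage example are shortened placeholders, not my real paths.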

My observations.

As a suggestion, adding more information about these cases to the documentation would help avoid these issues.