berman-lab / ymap

YMAP - Yeast Mapping Analysis Pipeline : An online pipeline for the analysis of yeast genomic datasets.
MIT License

Running ddRADseq without fragment-length bias correction gives errors #23

Closed vladimirg closed 7 years ago

vladimirg commented 9 years ago

Undefined function or variable "cP_dist_is_L".

Error in ChARM_v4 (line 466)
Pcond_dist_is_L{chr} = cP_dist_is_L;

Error in processing2 (line 4)
ChARM_v4('test-C-10','testuser','C_albicans_SC5314_vA21-s02-m09-r07','testuser','/Users/bermanlab/dev/ymap/scripts_seqModules/scripts_ddRADseq/../../');

Error in run (line 63)
evalin('caller', [script ';']);

vladimirg commented 8 years ago

From a previous discussion with Darren, the cause may have been an incorrectly processed reference genome. The chromosome_sizes.txt had this:

# Chr   size(bp)    name
1   61  Chr1
2   61  Chr2
3   61  Chr3
4   61  Chr4
5   61  Chr5
6   61  Chr6
7   61  Chr7
8   61  Chr8
9   61  Chr9

So perhaps the question should be, how did the genome break this way?
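
A quick sanity check on a file like this could flag the problem before the analysis scripts run. The sketch below is hypothetical and not part of YMAP: the function names and the `min_size` threshold are assumptions chosen for illustration, and it only assumes the three-column layout shown above (index, size, name, with `#` marking the header). It flags a table in which every chromosome has the same size, or in which any size is implausibly small.

```python
def read_chromosome_sizes(text):
    """Parse a chromosome_sizes.txt-style table into (index, size, name) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '# Chr size(bp) name' header.
        if not line or line.startswith("#"):
            continue
        index, size, name = line.split(None, 2)
        rows.append((int(index), int(size), name))
    return rows


def looks_suspicious(rows, min_size=1000):
    """Return True if the sizes look like the result of a broken genome install.

    min_size is an arbitrary illustrative threshold, not a YMAP setting.
    """
    sizes = [size for _, size, _ in rows]
    all_identical = len(sizes) > 1 and len(set(sizes)) == 1
    too_small = any(size < min_size for size in sizes)
    return all_identical or too_small
```

Run against the table above, `looks_suspicious` returns `True` on both counts: all nine sizes are identical, and 61 bp is far too short for a chromosome.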

ghost commented 8 years ago

The sizes are calculated by genome.install_1.php and then passed as a session variable to genome.install_2.php, which generates chromosome_sizes.txt and writes only the chromosomes that were chosen to be drawn. The code seems to work fine, so I think there are two possible reasons for the failure:

  1. The input FASTA file was invalid.
  2. According to the FASTA format description on Wikipedia, a FASTA file can contain comment lines that start with ";" or empty lines (which may contain only whitespace and will still be counted). The code currently doesn't ignore them (lines 136-149 in genome.install_1.php): it expects the header line to come first, the data on the next line, and so on. Maybe we should consider ignoring these lines, but I doubt they are the reason for the output mentioned above, since then the file would have contained more lines.
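
To make the second point concrete, here is a minimal sketch of a size calculation that skips ";" comment lines and blank or whitespace-only lines. It is written in Python for illustration only and is not the logic in genome.install_1.php; the function name is an assumption, and it treats everything after ">" as the sequence name.

```python
def fasta_sequence_lengths(lines):
    """Return {sequence name: length in bp} for an iterable of FASTA lines."""
    lengths = {}
    current = None
    for raw in lines:
        line = raw.strip()
        # Ignore blank / whitespace-only lines and ';' comment lines.
        if not line or line.startswith(";"):
            continue
        if line.startswith(">"):
            # Header line: start counting a new sequence.
            current = line[1:].strip()
            lengths[current] = 0
        elif current is not None:
            # Sequence data may span many lines; sum all of them.
            lengths[current] += len(line)
    return lengths


example = [";a comment", ">Chr1", "ACGT", "", "AC", ">Chr2", "   ", "GGG"]
print(fasta_sequence_lengths(example))  # {'Chr1': 6, 'Chr2': 3}
```

The sequence length is summed over every data line under a header, so the result does not depend on how the file wraps its lines.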

darrenabbey commented 8 years ago

I'm away from a computer where I can check this. How are the chromosome name strings defined in the example genome? The code to parse out the usable names may incorporate some false assumptions about how the strings are formatted.


vladimirg commented 8 years ago

@darrenabbey, unfortunately my documentation is very sparse, so I can't say :( The output I gave was copied from a conversation we had more than a year ago. Since it had something to do with genome uploading, and since most users don't upload their own genomes, I decided to de-prioritize this one until time permits (or another user reports this error).

vladimirg commented 7 years ago

Can't reproduce, closing.