galaxy001 / pirs

profile based Illumina pair-end Reads Simulator
https://code.google.com/p/pirs/
GNU General Public License v2.0

Request for better documentation on baseCalling_Matrix_*.pl #9

Open GodloveD opened 4 years ago

GodloveD commented 4 years ago

It would be great if there were better documentation explaining how to use the baseCalling_Matrix_*.pl scripts.

For instance, baseCalling_Matrix_merger.pl lists <matrix list> and <merged outfile> as input, but it does not state how <matrix list> should be provided. Should it be space delimited, comma delimited, colon delimited, enclosed in single or double quotes, enclosed in brackets or curly braces, etc.? Trying many of these combinations has not worked, and it is unclear how we should proceed.

Thanks for considering this request!

galaxy001 commented 4 years ago

The <matrix list> is a plain text file listing the profile filenames, one filename per line.

Like this:

./xxx/yyy.matrix.txt
./abc.matrix.txt
moo.matrix.txt

baseCalling_Matrix_merger.pl merges plain text files such as Profiles/Base-Calling_Profiles/HG00702.PE91.matrix.gz, but you need to un-gzip them first.
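
As a rough illustration of the two preparation steps described above (un-gzip each matrix, then list the plain files one per line), here is a minimal Python sketch. The function name `prepare_matrix_list`, the output directory, and the added `.txt` suffix are assumptions made for this example; they are not part of pirs.

```python
import gzip
import shutil
from pathlib import Path


def prepare_matrix_list(gz_paths, out_dir, list_path):
    """Decompress each *.gz matrix and write a one-filename-per-line list.

    Hypothetical helper for illustration only; not part of pirs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    plain_paths = []
    for gz_path in map(Path, gz_paths):
        if not gz_path.name.endswith(".gz"):
            raise ValueError(f"expected a .gz file, got {gz_path}")
        # e.g. HG00702.PE91.matrix.gz -> HG00702.PE91.matrix.txt
        # (the .txt suffix is our own choice, mirroring the example list above)
        plain = out_dir / (gz_path.name[: -len(".gz")] + ".txt")
        with gzip.open(gz_path, "rb") as src, open(plain, "wb") as dst:
            shutil.copyfileobj(src, dst)
        plain_paths.append(plain)
    # One filename per line: no quotes, commas, or brackets.
    Path(list_path).write_text("".join(f"{p}\n" for p in plain_paths))
    return plain_paths
```

The file written to `list_path` is what would then be given as <matrix list>, with <merged outfile> as the second input.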

GodloveD commented 4 years ago

I think what you may be trying to say is that <matrix list> is a file containing a newline delimited list of profile filenames that need to be merged. Is that correct? Thank you for your clarification.

zhiliu-git commented 4 years ago

Hi, could you clarify how these *.profile files are generated?

I looked through all of the example files that come with the pirs package, but there is no file ending in .profile. When I ran baseCalling_Matrix_calculator on my bam files to generate base calling matrices, the output files ended in: 1) .ratio.matrix; 2) .stat; 3) .count.matrix and 4) .info, but I'm not sure how to get from these to the .profile. I tried tarballing all four output files, naming the tarballs .profile, and putting all generated *.profile files into a text file as the input, but that didn't work. I also tried putting each of the four output types into the matrix list file, and that didn't work either. The result was always the same: 35 please check the input files carefully at /usr/local/apps/pirs/2.0.2/bin/baseCalling_Matrix_merger.pl line 55, <$handles[...]> line 1.

GodloveD commented 4 years ago

@galaxy001 are you able to provide feedback on the question above? Thank you.

galaxy001 commented 4 years ago

I modified my previous answer to use .matrix.txt, to show the relation to Profiles/Base-Calling_Profiles/*.matrix.gz.

You can use baseCalling_Matrix_calculator.pl to generate prefix.count.matrix for use in "Base-Calling_Profiles". It is best to gzip it before putting it into that path.

The profile name should preferably be Profiles/Base-Calling_Profiles/xxx.PEnnn.matrix.gz. I cannot remember whether the C++ code loads all files or only picks up files matching some filename pattern. I have lost contact with the C++ guy.
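
The gzip-and-rename step suggested above might be sketched in Python as follows. The function name `install_profile` and its parameters are hypothetical. Treating `nnn` as the read length is an assumption based on the HG00702.PE91 example, and as noted above it is not confirmed whether the C++ code depends on this naming pattern.

```python
import gzip
import shutil
from pathlib import Path


def install_profile(count_matrix, profiles_dir, sample, read_len):
    """Gzip a prefix.count.matrix file into <profiles_dir>/<sample>.PE<read_len>.matrix.gz.

    Hypothetical helper for illustration only; not part of pirs.
    Assumes 'nnn' in xxx.PEnnn.matrix.gz is the read length.
    """
    profiles_dir = Path(profiles_dir)
    profiles_dir.mkdir(parents=True, exist_ok=True)
    target = profiles_dir / f"{sample}.PE{read_len}.matrix.gz"
    # Stream the plain-text matrix into a gzip-compressed copy.
    with open(count_matrix, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For example, `install_profile("prefix.count.matrix", "Profiles/Base-Calling_Profiles", "mysample", 100)` would write Profiles/Base-Calling_Profiles/mysample.PE100.matrix.gz, where `mysample` and `100` are placeholder values.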