bcgsc / NanoSim

Nanopore sequence read simulator

Better descriptions of input file formats #83

Closed: fairliereese closed this issue 3 years ago

fairliereese commented 4 years ago

I think many people would find this tool more useful if you provided example file formats for the various types of reference files you expect people to use. For instance, I am trying to simulate transcriptome reads, and your README simply says I need to use "a reference transcriptome". I went with the GENCODE transcriptome simply because that's what the rest of my analysis uses. However, it's incompatible with your tool.

For those having difficulty getting things to work: if your error looks like this:

[screenshot: Screen Shot 2020-06-02 at 11 19 47 AM]

it's because the software expects your transcriptome FASTA file to look like this:

[screenshot: Screen Shot 2020-06-02 at 11 21 16 AM]

with the FASTA headers space-separated and the transcript ID in the first field.
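Since the screenshots are not reproduced here, the following is a minimal sketch of the header rewrite described above, not part of NanoSim itself. It assumes the input uses GENCODE-style pipe-delimited headers (`>transcript_id|gene_id|...|`). The function names `normalize_header` and `normalize_fasta` and the example record are made up for illustration.

```python
def normalize_header(line: str) -> str:
    """Turn a pipe-delimited FASTA header into a space-separated one.

    Assumption: the transcript ID is the first pipe-delimited field,
    as in GENCODE transcript FASTA files.
    """
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    # Drop the leading '>' and the newline, then split on '|';
    # empty fields (e.g. from a trailing '|') are discarded.
    fields = [f for f in line[1:].rstrip("\n").split("|") if f]
    return ">" + " ".join(fields) + "\n"


def normalize_fasta(lines):
    """Yield the lines of a FASTA file with every header normalized."""
    for line in lines:
        yield normalize_header(line)


# Illustrative record only; the IDs and sequence are invented.
example = [
    ">ENST0000001.1|ENSG0000001.1|GENE-201|GENE1|1000|protein_coding|\n",
    "ACGTACGT\n",
]
converted = list(normalize_fasta(example))
```

To convert a real file, the same generator could be fed an open file handle and its output written to a new file with `writelines`, leaving the original untouched.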

cheny19 commented 4 years ago

Sure! I'll update the README in the new release as well!

fairliereese commented 4 years ago

Thanks for your responsiveness! Edit: I see now that there is a short description of this specific input file format in the README. I apologize for not seeing it before. Hopefully this will still help users who are having similar problems, though!

cheny19 commented 4 years ago

It was not very clear before. I have now added to the README that transcript names have to be consistent across all the input files. Thanks for the suggestion!
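For anyone who wants to check that consistency before running the tool, here is a rough sketch rather than anything NanoSim provides. It assumes the transcript ID is the first whitespace-separated field of each FASTA header, and that you have already collected the transcript names used in your other input file into a list. `fasta_ids`, `missing_ids` and the sample data are hypothetical.

```python
def fasta_ids(lines):
    """Collect the first whitespace-separated field of each FASTA header."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def missing_ids(fasta_lines, other_ids):
    """Return the names in other_ids that match no FASTA header ID."""
    return sorted(set(other_ids) - fasta_ids(fasta_lines))


# Invented data: the second header is still pipe-delimited, so its
# first field is 'tx2|geneB|' and the name 'tx2' is reported as missing.
fasta_lines = [">tx1 geneA\n", "ACGT\n", ">tx2|geneB|\n", "ACGT\n"]
other_ids = ["tx1", "tx2"]
unmatched = missing_ids(fasta_lines, other_ids)
```

An empty result means every name in the other file has a matching header; anything listed is worth inspecting before starting a simulation.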