test data #3 (kolikem/loma)

HLHsieh commented 11 months ago

Hi,

Thank you for this amazing tool. I would like to test out loma and was wondering if you could add a small example dataset for users to demo the tool with.

Besides, how crucial is the quality of the reads for the analysis? I only have a set of FASTA files without quality scores. I'm contemplating whether I can still use these FASTA files with this algorithm or if it's feasible to simulate quality scores for the FASTA files to convert them into FASTQ files.
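For reference, the conversion asked about here can be sketched in a few lines of Python. This is only an illustration: the function names `read_fasta` and `fasta_to_fastq` and the default score of 20 are assumptions made for this example, not part of loma. With one constant score, every base carries equal weight, so the FASTQ file is syntactically valid but holds no real quality information. Whether loma gives meaningful results on such input is not established in this thread.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(src: TextIO, dst: TextIO, phred: int = 20) -> int:
    """Write FASTQ records with one constant placeholder quality.

    The quality is a stand-in, not a measurement. Returns the record count.
    """
    if not 0 <= phred <= 93:
        raise ValueError("phred must be in 0..93 for Phred+33 encoding")
    qual_char = chr(phred + 33)  # Phred+33 (Sanger) encoding
    count = 0
    for header, seq in read_fasta(src):
        dst.write(f"@{header}\n{seq}\n+\n{qual_char * len(seq)}\n")
        count += 1
    return count


# Small in-memory demonstration with one wrapped record.
demo_in = io.StringIO(">read1\nACGT\nAC\n")
demo_out = io.StringIO()
fasta_to_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, open the FASTA and FASTQ paths with `open(...)` and pass the two handles in place of the `io.StringIO` objects.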

Also, I would like to confirm the required versions of mafft and minimap2, since "MAFFT ver.7 <=" does not seem to be available.

Thank you!

Best, Hsin

kolikem commented 11 months ago

Dear Hsin

Thank you very much for being interested in our work.

On the first question: two sample FASTQ files are provided for testing loma. They are stored under "sample" in the repository. Please tell me if they do not work in your environment.

Second, in the algorithm, the base qualities of all reads spanning a position are integrated to determine the consensus base at that position. So the current version of loma requires FASTQ input, not FASTA. We are eager to adapt it to data without quality scores, but unfortunately this is not finished yet.
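To illustrate why quality scores matter for this kind of consensus step, here is a minimal sketch of one possible quality-weighted vote at a single position. It is not loma's actual implementation: the weighting rule (summing Phred scores per base) and the names `phred33` and `consensus_base` are assumptions chosen for this example.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def phred33(char: str) -> int:
    """Decode one Phred+33 quality character into an integer score."""
    return ord(char) - 33


def consensus_base(column: Iterable[Tuple[str, str]]) -> str:
    """Pick a consensus base from (base, quality_char) pairs.

    Each pair comes from one read spanning the position. Every base is
    weighted by its Phred score (an assumed scheme, not loma's).
    """
    weights = defaultdict(int)
    for base, qchar in column:
        weights[base.upper()] += phred33(qchar)
    if not weights:
        raise ValueError("no reads span this position")
    # Sorting first makes ties resolve alphabetically and deterministically.
    return max(sorted(weights), key=lambda b: weights[b])


# One confident 'A' (Q40) against three low-quality 'C' calls (Q2 each).
column = [("A", "I"), ("C", "#"), ("C", "#"), ("C", "#")]
print(consensus_base(column))
```

In this example the single high-quality 'A' outweighs the three low-quality 'C' calls, whereas a plain majority vote would pick 'C'. With identical scores on every base, the weighted vote reduces to a majority vote.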

Third, the current version of loma was developed with the versions of minimap2 and mafft listed in the README. loma may also work with older releases of mafft or minimap2, but the results are not guaranteed.

I hope this helps your research.

Regards and respect, Ko

j-schoenebeck commented 10 months ago

Hi Ko,

I too would like to use this interesting tool, but I am having problems getting it to run, even on the test data. Maybe I am misinterpreting the README. It says: "python 3.8 <=, minimap2 ver.2.0 <=, MAFFT ver.7 <=, numpy"

To me, this means versions 'less than', i.e. MAFFT ver.7 or less... But in your response above to HLHsieh, you seem to imply that versions 'greater than' should be OK. Can you tell me the exact versions that were used in development?

I hope MAFFT versions > 7 work, because I cannot find where to download versions less than this!

Many thanks! Jeff

kolikem commented 10 months ago

Hi Jeff,

Thank you for your inquiry. I am sorry that the description in the README was confusing. The signs do not mean 'less than' but 'greater than', that is, the listed version or later. In response to your point, I have updated the README and improved usability (please use release v1.1.3). To get results, please make sure to specify absolute paths for the input and output.
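Read this way, the requirements are python 3.8 or later, minimap2 2.0 or later, MAFFT 7 or later, and numpy. A minimal sketch of a local check follows. It assumes both tools are on `PATH` and that each prints a version string when called with `--version`. The helper names and the `MINIMUMS` table are made up for this example and are not part of loma.

```python
import importlib.util
import re
import shutil
import subprocess
import sys
from typing import Optional, Tuple

Version = Tuple[int, int]

# Minimum versions as stated in the thread (assumed to be inclusive).
MINIMUMS = {"minimap2": (2, 0), "mafft": (7, 0)}


def parse_version(text: str) -> Optional[Version]:
    """Extract the first 'major[.minor]' number pair found in text."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def installed_version(tool: str) -> Optional[Version]:
    """Run '<tool> --version' and parse its output; None if unavailable."""
    if shutil.which(tool) is None:
        return None
    try:
        result = subprocess.run(
            [tool, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print the version to stderr, so inspect both streams.
    return parse_version(result.stdout + result.stderr)


def meets_minimum(version: Optional[Version], minimum: Version) -> bool:
    """True if a version was found and is at least the minimum."""
    return version is not None and version >= minimum


def report() -> None:
    """Print one status line per prerequisite."""
    py = sys.version_info[:2]
    print("python", py, "ok" if py >= (3, 8) else "too old")
    for tool, minimum in MINIMUMS.items():
        found = installed_version(tool)
        status = "ok" if meets_minimum(found, minimum) else "missing or too old"
        print(tool, found, status)
    has_numpy = importlib.util.find_spec("numpy") is not None
    print("numpy", "ok" if has_numpy else "missing")
```

Calling `report()` prints the status of each prerequisite. The parser only reads the first number pair it finds, so unusual version banners may need a stricter pattern.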

Once again, thank you very much for sharing your problem and sorry for the inconvenience.

Best regards, Ko

j-schoenebeck commented 10 months ago

Hello Ko,

Thank you very much for your prompt reply to my comment. I am going to reinstall the prerequisites and give this another try very soon.

By the way, my test case is to try to resolve a known SV (actually a CNV) using nanopore reads produced through adaptive sequencing. Thus coverage throughout the genome is very sparse, but quite a bit deeper at the CNV. It could be that I still don’t have enough depth for LoMa to resolve the haplotypes. But I figured I would give it a try nonetheless.

Best wishes,

Jeff

On 14 Dec 2023, at 17:32, kolikem wrote: (quoted text identical to the previous comment)
