mcanouil / eggla

Early Growth Genetics Longitudinal Analysis.
https://m.canouil.dev/eggla/
Other
2 stars 1 forks source link

GWAS function output file to tab separated #115

Closed annihei closed 1 year ago

annihei commented 1 year ago

Bug description

Could we change the run_eggla_gwas function's summary stats output into tab separated .txt.gz file? I think this was discussed before and is like that in the analysis plan but at the moment the output comes as comma separated. The csv files are difficult in the QC process I have to currently convert them into tab separated before the QC.

eggla version output

> packageVersion('eggla')
[1] ‘0.19.5’

Checklist

mcanouil commented 1 year ago

Should we also properly name the file? .tsv.gz

annihei commented 1 year ago

We are now using easyQC for the quality control and I originally named the converted files as .tsv; it could not read them. I don't know how the QC will be carried out in the next wave but if they are going to use easyQC it would be good if the filenames were given in usable form.

mcanouil commented 1 year ago

If I understand, the idea is to be sure the GWAS results files correspond to what easyQC expects?

renaming txt.gz to tsv.gz should not change anything, it will simply ensure you know it's tab-separated directly from the file extension. So does easyQC really wants files to have "txt" extension in their names?

annihei commented 1 year ago

Yes, this was the issue. I could not find in the documentation a list of file extensions it accepts, but it did not work with tab separated tsv.gz files, and after changing nothing else but the name to txt.gz it was fine.

mcanouil commented 1 year ago

That's pretty badly implemented in my opinion, but ok, let's go for txt.gz then.

mcanouil commented 1 year ago

I'll make the change/update later this week.

By the way, feel free to make a pull request for these things. I this case, it's adding sep = "\t" in the following line of code.

https://github.com/mcanouil/eggla/blob/4538ff2d53424f94cca44f188c1006c952fd4929/R/run_eggla_gwas.R#L662-L663

annihei commented 1 year ago

Yeah, I have started to feel like it is a bit out of date with some features but that's the way we went with the QC. As said, I think someone else might be working with the QC in the next wave with different software but I think for now the idea was to react if there are some things that come up with the analysis on these six cohorts.

Re pull request I am sorry I am really useless with git(hub) and afraid of touching anything! I have been meaning to learn some basics with this project but haven't found the time for it. It is still high in my list I know it would help with other work as well but meanwhile I am sorry for making your harder with being like this :)

mcanouil commented 1 year ago

Yeah, I have started to feel like it is a bit out of date with some features but that's the way we went with the QC. As said, I think someone else might be working with the QC in the next wave with different software but I think for now the idea was to react if there are some things that come up with the analysis on these six cohorts.

I also think, that's the best move. I don't recall (but I can double check) if we explicitly wrote in the analytical plan, that we were going to use easyQC.£ If not, probably worth updating both the file type and adding the tool "we" use.

Re pull request I am sorry I am really useless with git(hub) and afraid of touching anything! I have been meaning to learn some basics with this project but haven't found the time for it. It is still high in my list I know it would help with other work as well but meanwhile I am sorry for making your harder with being like this :)

No worries/pressure. If you feel trying, you are welcome to do that, otherwise, it's fine, I'll make the change this week-end.

annihei commented 1 year ago

Thank you, Mickaël!