pepkit / geofetch

Builds a PEP from SRA or GEO accessions
https://pep.databio.org/geofetch/
BSD 2-Clause "Simplified" License

colons in values #92

Closed nsheff closed 1 year ago

nsheff commented 1 year ago

geofetch should quote any constant columns it's sticking in the config, since embedded colons are causing problems.

khoroshevskyi commented 1 year ago

Do you have a GSE example to check whether it is still a problem?

nsheff commented 1 year ago

GSE181846

khoroshevskyi commented 1 year ago

In geofetch 0.11.0 this error is solved by deleting all special characters. I'm not sure it's the best solution, but I went with it because quoted text was causing errors: some of the text in GEO and NCBI contains double or single quotes.

khoroshevskyi commented 1 year ago

geofetch -i GSE181846 --metadata-folder `pwd` --just-metadata --const-limit-project 1 --const-limit-discard 10000 --attr-limit-truncate 10000

You can run this command using geofetch from this branch: https://github.com/pepkit/geofetch/tree/refactoring

nsheff commented 1 year ago

I would suggest:

  1. quote the fields
  2. escape any internal quotes.

Would that work? I think it's better than stripping characters.
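
For illustration, here is a minimal sketch of the quote-and-escape suggestion above. It is not geofetch's actual code: the function names `yaml_double_quote` and `render_constant_columns`, the `sample_description` key, and the example value are all hypothetical. It assumes the constant columns are written as YAML key/value pairs and that double-quoted YAML scalars are acceptable in the config.

```python
def yaml_double_quote(value: str) -> str:
    """Wrap a string in double quotes, escaping what YAML would misread.

    Hypothetical helper, not part of geofetch.
    """
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')         # then internal double quotes
        .replace("\n", "\\n")        # keep the value on one line
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def render_constant_columns(constants: dict[str, str]) -> str:
    """Render constant columns as indented YAML lines with quoted values."""
    return "\n".join(
        f"  {key}: {yaml_double_quote(val)}" for key, val in constants.items()
    )


# Hypothetical metadata value containing a colon, double quotes, and an apostrophe.
example = {"sample_description": 'protocol: ChIP-seq, "treated" cells, 5\' end'}
print(render_constant_columns(example))
```

Inside a double-quoted YAML scalar, single quotes and colons need no escaping, so only backslashes, double quotes, and control characters are handled here. In practice, letting a YAML library such as PyYAML (`yaml.safe_dump`) serialize the values would likely be more robust than hand-written escaping.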