usgs / gspy

ASEG GDF2 is fixed width format but read_csv assumes white space #14

Open bamburgh opened 10 months ago

bamburgh commented 10 months ago

Hi developers,

Thank you for your work on gspy.

I'm Mark Dransfield, and I'm looking to use gspy in my airborne gravity and gravity gradiometry QC code. My first trial data set did not load because there is no white space between two of the data fields, whereas read_csv assumes there will be. The data are public domain, so I can easily send a data snippet on request (with the JSON metadata files I made).

The start of the first line in the .DAT file is:

DATA 100010.0 153.764 199.666 89.0742021/11/24 64.3

with LINE_NO HEIGHT1 HEIGHT2 BEARING DATE Terrain_Height

and there is no space between BEARING and DATE as you can see.
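The failure described above can be reproduced with the standard library alone. The following is a minimal sketch, not gspy's reader: the `widths` list is an assumption chosen to fit the line exactly as it is rendered in this thread (in a real ASEG-GDF2 data set the widths come from the .dfn file), and `RECORD` is a hypothetical name for the leading `DATA` tag.

```python
from itertools import accumulate

# First record of the .DAT file, as quoted in this thread.
line = "DATA 100010.0 153.764 199.666 89.0742021/11/24 64.3"

# Whitespace splitting fuses BEARING and DATE into a single token.
tokens = line.split()

# Assumed column layout: widths fitted to the line above, not read from a .dfn file.
names = ["RECORD", "LINE_NO", "HEIGHT1", "HEIGHT2", "BEARING", "DATE", "Terrain_Height"]
widths = [4, 9, 8, 8, 7, 10, 5]


def split_fixed(text, column_widths):
    """Slice text into fields by character position and strip the padding."""
    ends = list(accumulate(column_widths))
    starts = [0] + ends[:-1]
    return [text[start:end].strip() for start, end in zip(starts, ends)]


record = dict(zip(names, split_fixed(line, widths)))
print(tokens)
print(record)
```

Slicing by position does not depend on a separator, so adjacent fields that fill their full width are still recovered as separate values.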

cheers Mark

leonfoks commented 10 months ago

Hi Mark!

Thanks for your interest in gspy! Could you please comment with a link to the data set since it’s public domain? I can download and get it working.

Thanks!

bamburgh commented 10 months ago

Hi Leon!

I wondered if it would be you to respond. Trust all is well. The data are at:

https://geoscience.data.qld.gov.au/data/gravity-gradiometry/gg100099

cheers

leonfoks commented 8 months ago

Hey Mark,

Just letting you know we are working on this. Multiple issues came up while dealing with this data file, so it's taking a little longer. Once we have it sorted, I'll summarize it all here and we can discuss.

Thanks!

leonfoks commented 7 months ago

Hi @bamburgh ,

We've pushed a good number of changes related to this:

a) We can now import Fortran-formatted .dat files that have no spaces between columns of data. Survey.add_tabular('aseg') now takes an extra keyword, "fixed_format=True", to handle this.

b) We also updated our reader to handle multiple fields on the same line.
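For readers unfamiliar with Fortran formatting, column widths in such files are given by edit descriptors such as `A4`, `I10` or `F8.3`, where the number after the letter is the field width in characters. The sketch below turns simple descriptors into widths. It is not gspy's implementation; `descriptor_widths` is a hypothetical helper, the `formats` list is invented to match the sample line quoted earlier in this thread, and only plain descriptors with an optional repeat count are handled.

```python
import re

# Matches simple Fortran edit descriptors such as A4, I10, F8.3 or 2F8.3.
_DESCRIPTOR = re.compile(r"^(\d*)([AIFEDG])(\d+)(?:\.\d+)?$", re.IGNORECASE)


def descriptor_widths(descriptor):
    """Return one width per field described by a single edit descriptor."""
    match = _DESCRIPTOR.match(descriptor.strip())
    if match is None:
        raise ValueError(f"unsupported descriptor: {descriptor!r}")
    repeat = int(match.group(1) or 1)  # a leading count repeats the field
    width = int(match.group(3))        # total characters, including the decimals
    return [width] * repeat


# Hypothetical formats for the seven fields of the sample record.
formats = ["A4", "F9.1", "2F8.3", "F7.3", "A10", "F5.1"]
widths = [w for fmt in formats for w in descriptor_widths(fmt)]
print(widths)
```

Once the widths are known, each record can be cut by character position rather than by white space.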

Probably the biggest issue with the file you linked was the use of the mu character inside the dfn file. We have not yet been able to get any ASCII reader to read that character successfully. So for now we detect characters that cannot be converted to ASCII and simply ask the user to replace them. I have a feeling that this is a Unicode vs ASCII issue, and we just need to dig a little deeper to determine what to do about symbols in the metadata.
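A check of this kind can be sketched in a few lines. This is an illustration only and not the code in gspy: `find_non_ascii` is a hypothetical helper, and the sample text is invented rather than taken from the actual dfn file. It assumes the mu is the micro sign U+00B5; a Greek mu (U+03BC) would be reported in the same way.

```python
def find_non_ascii(text):
    """Return (line number, column, character) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if not char.isascii():
                hits.append((line_no, col, char))
    return hits


# Invented metadata text containing a micro sign in a units string.
sample = "NAME=GRAV\nUNITS=\u00b5m/s^2\n"

hits = find_non_ascii(sample)
for line_no, col, char in hits:
    print(f"line {line_no}, column {col}: {char!r} (U+{ord(char):04X})")

# One possible manual fix: replace the symbol with an ASCII stand-in.
cleaned = sample.replace("\u00b5", "u")
```

Reporting the line and column lets the user find and replace the offending symbol by hand, which is the workaround described above.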

Go ahead and re-install gspy, and keep us posted on how you get on!

bamburgh commented 7 months ago

Thank you Leon,

I’ll test that out soon. The mu character I had not noticed. The ASEG-GDF2 standard is for ASCII so I think you have done the right thing. On the other hand, the standard is old and really should be updated to Unicode.

Cheers
Mark

Dr Mark Dransfield

bamburgh commented 7 months ago

Hi Leon,

I upgraded to the latest gspy and my old survey metadata no longer works. The error message looks suspicious to me: it complains that it was not given a dict, but the code should accept either a dict or a str (filename).
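For reference, the "dict or filename" behaviour described above could look like the sketch below. It is a hypothetical stand-alone helper, not gspy's code or API: `load_metadata`, the file name and the metadata content are all invented for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_metadata(source):
    """Return metadata as a dict, given either a dict or a path to a JSON file."""
    if isinstance(source, dict):
        return source
    if isinstance(source, (str, Path)):
        with open(source, "r", encoding="utf-8") as handle:
            return json.load(handle)
    raise TypeError(f"expected dict or filename, got {type(source).__name__}")


# Both call styles should give the same result.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "survey_md.json"  # invented file name
    path.write_text(json.dumps({"title": "example"}), encoding="utf-8")
    from_file = load_metadata(str(path))

from_dict = load_metadata({"title": "example"})
print(from_file == from_dict)
```

If a reader accepts only one of the two forms, passing the other produces a type complaint of the kind reported here.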

Hopefully the screenshot comes through. Let me know if you want the survey metadata JSON file.

Mark

[Screenshot of the error message: Image 21-2-2024 at 17 24]