MushroomObserver / mushroom-observer

A website for sharing observations of mushrooms.
https://mushroomobserver.org
MIT License

iNat import #1955

Open mo-nathan opened 6 months ago

mo-nathan commented 6 months ago

Tasks

Done??

JoeCohen commented 5 months ago

iNat API documents:

JoeCohen commented 5 months ago

To play with API requests:

The very simple API request for a single iNat Observation (202555552). It returns tons of data. https://api.inaturalist.org/v1/observations?id=202555552
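A minimal sketch of that request in Ruby, in case it helps. It assumes the v1 response is a JSON object whose `results` array holds the observation; `inat_observation_uri` and `first_inat_result` are hypothetical helper names, not existing MO code. The actual network call is left commented out.

```ruby
require "json"
require "uri"

INAT_API_BASE = "https://api.inaturalist.org/v1/observations"

# Build the request URI for a single iNat observation id.
def inat_observation_uri(inat_id)
  uri = URI(INAT_API_BASE)
  uri.query = URI.encode_www_form(id: inat_id)
  uri
end

# Pull the first observation out of a response body.
# Assumption: the body is JSON with a top-level "results" array.
def first_inat_result(body)
  JSON.parse(body, symbolize_names: true).fetch(:results, []).first
end

# Usage (needs network access, so left commented out):
#   require "net/http"
#   body = Net::HTTP.get(inat_observation_uri(202555552))
#   obs = first_inat_result(body)
```

Keeping the URL building and the parsing separate should make both easy to test against a saved fixture without hitting iNat.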

JoeCohen commented 5 months ago

Plan: start with a foreground job that imports a single iNat Observation without authentication. Later:

JoeCohen commented 4 months ago

Photos: they're in AWS. Pseudocode:

obs[:observation_photos].each do |photo|
  aws_id = photo[:photo_id]
  # import this:
  url = "https://inaturalist-open-data.s3.amazonaws.com/photos/#{aws_id}/original.jpeg"
end
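
The loop above could be wrapped up roughly like this. It is only a sketch: `inat_photo_urls` is a hypothetical name, `obs` is assumed to be the parsed observation with symbol keys, and it assumes every photo is available in that bucket as `original.jpeg`, which may not hold for all photos.

```ruby
INAT_PHOTO_BUCKET = "https://inaturalist-open-data.s3.amazonaws.com/photos"

# Return the assumed "original" image URL for each photo of an observation.
# `obs` is a parsed iNat observation (Hash with symbol keys).
def inat_photo_urls(obs)
  obs.fetch(:observation_photos, []).map do |photo|
    "#{INAT_PHOTO_BUCKET}/#{photo[:photo_id]}/original.jpeg"
  end
end

# Example with made-up photo ids:
obs = { observation_photos: [{ photo_id: 1 }, { photo_id: 2 }] }
puts inat_photo_urls(obs)
```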

To get an image into a local tmp file (per Copilot):

require 'open-uri'

url = 'http://example.com/image.png' # Replace with the actual image URL
local_file_path = Rails.root.join('tmp', 'image.png') # Specify the local file path

# Download the image and save it locally
# (URI.open, because Kernel#open no longer opens URLs as of Ruby 3.0)
IO.copy_stream(URI.open(url), local_file_path)

# Now `local_file_path` contains the downloaded image

Tempfile Class: The Tempfile class in Ruby provides a convenient way to manage temporary files. When you create a Tempfile object, it automatically generates a unique filename in the OS’s temp directory. You can perform standard file operations on it, such as reading, writing, and changing permissions. Here’s how you can use it:

require 'tempfile'

file = Tempfile.new('foo')
file_path = file.path # A unique filename in the OS's temp directory
file.write('hello world')
file.rewind
content = file.read
file.close
file.unlink # Deletes the temp file
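
Putting the two snippets together, a download into a Tempfile might look like the sketch below. `download_to_tempfile` and its `opener:` parameter are hypothetical; the parameter is only there so the download can be stubbed in a test, and by default it is `URI.open` from open-uri.

```ruby
require "open-uri"
require "tempfile"

# Copy a remote file into a Tempfile and return the rewound Tempfile.
# `opener` is a hypothetical injection point for tests; it must return
# something readable (URI.open returns a StringIO or Tempfile).
def download_to_tempfile(url, opener: URI.method(:open))
  tempfile = Tempfile.new(["inat_photo", File.extname(url)])
  tempfile.binmode # image data is binary
  source = opener.call(url)
  IO.copy_stream(source, tempfile)
  tempfile.rewind
  tempfile
ensure
  source.close if source.respond_to?(:close)
end

# Usage (needs network access, so left commented out):
#   file = download_to_tempfile("http://example.com/image.png")
#   file.path   # unique path in the OS's temp directory
#   file.close
#   file.unlink
```

The caller is responsible for closing and unlinking the returned Tempfile once the image has been processed.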

JoeCohen commented 3 months ago

How do I copy an image from an external website to a (temp) file on the server?

Nathan Wilson nathan@collectivesource.com Sun, Apr 21, 2024 at 5:04 AM ... At least leveraging the code in the API makes sense to me. Not sure it's worth creating a service that just translates between the APIs, but it would be cool and might get us to at least improve the online documentation around our API. I heard at last year's NAMA foray that ChatGPT is better at creating code using the iNat API than the MO API. I think Alan has looked at that and might have some examples.

On Sun, Apr 21, 2024 at 12:19 AM Jason Hollinger pellaea@gmail.com wrote: Yes, check out around line 58 of app/classes/api2/core/uploads.rb for one possible solution. It might be altogether a dumb idea to literally just use the API to accomplish your whole task.

nimmolo commented 3 months ago

Good to know, I was wondering the same thing. We use File class methods all over MO.

In the Stack Overflow answer:

I'd go after the file using Ruby's Open::URI:

require "open-uri"

File.open('pie.png', 'wb') do |fo|
  fo.write open("http://chart.googleapis.com/chart?#{failures_url}").read
end

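I wondered, how is he using Open::URI here, though? The answer seems to be that requiring open-uri patches the bare open call (Kernel#open, not File.open), so any string that begins with protocol:// is parsed by URI. That only holds before Ruby 3.0; from 3.0 on the call has to be written URI.open.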

On May 3, 2024, at 1:33 PM, Nathan Wilson @.***> wrote:

I'd either use system to call one of the standard command line tools for this (curl or wget) or I'd explore a Ruby library like Open::URI http://rubydoc.info/stdlib/open-uri/1.9.2/frames. See https://stackoverflow.com/questions/6768238/download-an-image-from-a-url

JoeCohen commented 3 months ago

From README_API.md

The response will include the id of the new record.

Attach the image as POST data or URL. See script/test_api for an example of how to attach an image in the POST data.

There is no script/test_api. Maybe test/controllers/api2_controller_test.rb#test_post_maximal_image?

  def test_post_maximal_image
    setup_image_dirs
    rolf.update(keep_filenames: "keep_and_show")
    rolf.reload
    file = Rails.root.join("test/images/Coprinus_comatus.jpg").to_s
    proj = rolf.projects_member.first
    obs = rolf.observations.first
    File.stub(:rename, false) do
      post_and_send_file(:images, file, "image/jpeg",
                         api_key: api_keys(:rolfs_api_key).key,
                         vote: "3",
                         date: "20120626",
                         notes: " Here are some notes. ",
                         copyright_holder: "My Friend",
                         license: licenses(:ccnc30).id.to_s,
                         original_name: "Coprinus_comatus.jpg",
                         projects: proj.id,
                         observations: obs.id)
    end
    ...
  end

  def post_and_send_file(action, file, content_type, params)
    body = Rack::Test::UploadedFile.new(file, "image/jpeg").read
    md5sum = file_checksum(file)
    post_and_send(action, body, content_type, md5sum, params)
  end

  def post_and_send(action, body, content_type, md5sum, params)
    @request.env["CONTENT_TYPE"] = content_type
    @request.env["CONTENT_MD5"] = md5sum
    post(action, params: params, body: body)
  end

Better: test/models/api2_test.rb#test_posting_image_via_url

  def test_posting_image_via_url
    setup_image_dirs
    url = "https://mushroomobserver.org/images/thumb/459340.jpg"
    stub_request(:any, url).
      to_return(Rails.root.join("test/images/test_image.curl").read)
    params = {
      method: :post,
      action: :image,
      api_key: @api_key.key,
      upload_url: url
    }
    File.stub(:rename, false) do
      api = API2.execute(params)
      assert_no_errors(api, "Errors while posting image")
      img = Image.last
      assert_obj_arrays_equal([img], api.results)
      actual = File.read(img.local_file_name(:full_size))
      expect = Rails.root.join("test/images/test_image.jpg").read
      assert_equal(expect, actual, "Uploaded image differs from original!")
    end
  end

JoeCohen commented 3 months ago

The following is returned by the API help request http://localhost:3000//api2/images?help=1

{
  "version": 2,
  "run_date": "2024-05-04T11:40:53.075Z",
  "errors": [
    {
      "code": "API2::HelpMessage",
      "details": "Usage: confidence: confidence range (limit=-3..3); content_type: enum list (limit=bmp|gif|jpg|png|raw|tiff); copyright_holder_has: string (search within copyright holder); created_at: time range; date: date range (when photo taken); has_notes: boolean; has_observation: boolean (limit=true, is attached to an observation?); has_votes: boolean; id: integer list; include_subtaxa: boolean; include_synonyms: boolean; license: license; location: location list; name: name list; notes_has: string (search within notes); observation: observation list; ok_for_export: boolean; project: project list; quality: quality range (limit=1..4); size: enum (limit=huge|large|medium|small|thumbnail, width or height at least 160 for thumbnail, 320 for small, 640 for medium, 960 for large, 1280 for huge); species_list: species_list list; updated_at: time range; user: user list (who uploaded the photo)",
      "fatal": "true"
    }
  ],
  "run_time": 0.017291
}
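
That details string is hard to read as one line. A small sketch for splitting it into a Hash, assuming the format seen above ("Usage: " followed by "; "-separated "name: description" pairs); `parse_api_help` is a hypothetical helper, not part of API2.

```ruby
require "json"

# Turn the "Usage: name: description; name: description" string from an
# API2 help response into a Hash of parameter name => description.
# Assumption: entries are separated by "; " and each starts with "name: ".
def parse_api_help(body)
  details = JSON.parse(body).fetch("errors").first.fetch("details")
  details.delete_prefix("Usage: ").split("; ").to_h do |entry|
    name, description = entry.split(": ", 2)
    [name, description]
  end
end

# Usage (with the server running locally, left commented out):
#   require "net/http"
#   body = Net::HTTP.get(URI("http://localhost:3000/api2/images?help=1"))
#   parse_api_help(body).each { |name, desc| puts "#{name}: #{desc}" }
```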

JoeCohen commented 2 months ago

Is iNat Obs in Fungi?

Taxa page for Fungi: https://www.inaturalist.org/taxa/47170-Fungi_ So [:taxon][:ancestor_ids] must include 47170. Alternative: [:taxon][:iconic_taxon_name] == "Fungi"
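
Both checks could be combined in one predicate, roughly as below. `inat_fungi?` is a hypothetical name, and `obs` is assumed to be the parsed API result with symbol keys.

```ruby
FUNGI_TAXON_ID = 47170 # from the iNat taxa page for Fungi

# True if the parsed iNat observation is identified as something in Fungi,
# using either the ancestor ids or the iconic taxon name.
def inat_fungi?(obs)
  taxon = obs[:taxon] || {}
  Array(taxon[:ancestor_ids]).include?(FUNGI_TAXON_ID) ||
    taxon[:iconic_taxon_name] == "Fungi"
end
```

An observation with no taxon at all comes back false here, which is probably what the importer wants, but that is a design choice to confirm.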

JoeCohen commented 2 months ago

iNat Projects Issue.

I cannot reliably get an iNat observation's Projects via the new API. Ex: iNat 216745568 shows many projects via the UI. But the corresponding fixture has nothing. See test/fixtures/inat/gyromitra_ancilis.txt

The old API might work: https://www.inaturalist.org/pages/api+reference#get-observations Also see this Copilot response (about the UI):

As of now, iNaturalist provides a way to see which collection or umbrella projects include an observation. However, this feature is not available for traditional projects. If you're interested in finding out which projects an observation belongs to, here's how you can do it:

  1. Collection and Umbrella Projects:

    • Starting from the observation details page, you can now see the collection and umbrella projects that include an observation. This feature was added based on user requests. Simply navigate to the observation detail page, and you'll find the relevant information there.
  2. Traditional Projects:

    • Unfortunately, for traditional projects, there isn't a direct way to see which projects an observation qualifies for. The information about whether an observation belongs to a traditional project is not readily available in the observation data or CSV downloads.
    • If you're interested in this feature, consider raising it as a feature request on the iNaturalist platform. It would be valuable for users to know which projects their observations could potentially be part of.

Remember that iNaturalist is continually evolving, so keep an eye out for any updates or new features that might address this need! 🌿🔍

Source: Conversation with Copilot, 6/13/2024
(1) Updates to collection and umbrella projects · iNaturalist. https://www.inaturalist.org/blog/18375-updates-to-collection-and-umbrella-projects.
(2) Find out to which projects an observation has been added. https://forum.inaturalist.org/t/find-out-to-which-projects-an-observation-has-been-added/3236.
(3) Adding Observations to a Traditional Project - iNaturalist Community Forum. https://forum.inaturalist.org/t/adding-observations-to-a-traditional-project-wiki/13190.

JoeCohen commented 1 month ago

For contains_box, I need a (separate?) scope that generalizes this (from #2183):

  scope :contains, # Use named parameters (lat:, lng:), any order
        lambda { |**args|
          args => {lat:, lng:}
          where(
            Location[:south].lteq(lat).and(Location[:north].gteq(lat)).
            and(
              Location[:west].lteq(lng).and(Location[:east].gteq(lng)).or(
                Location[:west].gteq(lng).and(Location[:east].lteq(lng))
              )
            )
          )
        }

nimmolo commented 1 month ago

How about something like this, that takes advantage of named arg assignment:

  scope :contains, # Use named parameters (lat:, lng:, or north:, south:, east:, west:), any order
        lambda { |**args|
          # args is a Hash, so look the keys up with [] rather than as methods
          if args[:lat].present? && args[:lng].present?
            args => {lat:, lng:}
            north = south = lat
            east = west = lng
          elsif args[:north].present?
            args => {north:, south:, east:, west:}
          end
          where(
            Location[:south].lteq(south).and(Location[:north].gteq(north)).
            and(
              Location[:west].lteq(west).and(Location[:east].gteq(east)).or(
                Location[:west].gteq(west).and(Location[:east].lteq(east))
              )
            )
          )
        }

Not sure about all that, but something like that.
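
For working out the e/w cases before writing scope tests, a plain-Ruby version of the point check might be handy. This is only a sketch: `box_contains_point?` is a hypothetical helper, and the rule that a box with west > east straddles the 180° meridian is an assumption here, not something taken from the Arel above.

```ruby
# Plain-Ruby check of whether a bounding box contains a point.
# `box` is a Hash with :north, :south, :east, :west in degrees.
# Assumption: a box whose west is greater than its east straddles
# the 180 degree meridian.
def box_contains_point?(box, lat:, lng:)
  return false unless lat.between?(box[:south], box[:north])

  if box[:west] <= box[:east]
    lng.between?(box[:west], box[:east])
  else
    lng >= box[:west] || lng <= box[:east]
  end
end

# Example: a box running from 170 east across the meridian to 170 west
straddling = { north: 10, south: 0, east: -170, west: 170 }
puts box_contains_point?(straddling, lat: 5, lng: 175)
puts box_contains_point?(straddling, lat: 5, lng: 0)
```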

JoeCohen commented 1 month ago

@nimmolo Yes. That looks right. Needs some tests, especially for e/w. What's the best procedure for adding this:

nimmolo commented 1 month ago

I'd say do it as a standalone PR vs main. We have plenty of scopes as yet unused, and one more won't hurt — plus of course I'll use the lat/lng part in my next PR.

JoeCohen commented 1 month ago

Thanks. Will do.

JoeCohen commented 1 month ago

iNat API Recommended Practices